-
Notifications
You must be signed in to change notification settings - Fork 2
builder
The class EvenRegularGraphBuilder is used to construct search graphs.
class deglib.builder.EvenRegularGraphBuilder(graph: MutableGraph, rng: Mt19937 | None = None, optimization_target: OptimizationTarget = OptimizationTarget.LowLID, extend_k: int = 0, extend_eps: float = 0.1, improve_k: int = 0, improve_eps: float = 0.001, max_path_length: int = 5, swap_tries: int = 0, additional_swap_tries: int = 0)
Bases: object
Constructs an EvenRegularGraphBuilder for building and optimizing a regular graph.
This class provides functionality to incrementally build and optimize a graph structure by adding/removing vertices and improving edge connections through various strategies.
Provide the builder with new entries to be added to the graph during the build process.
Can add a single entry or a batch of entries. For batch processing, the feature array should have shape [N, D] where N is the number of entries and D is the dimensionality.
-
Parameters:
- external_label – The label(s) that name the added vertex/vertices. For batch processing, should be a list or array with length N matching the number of features
- feature – The feature vector(s) to be added. Shape [D] for single entry or [N, D] for batch
-
Raises:
- InvalidShapeException – If feature array has invalid shape (not 1D or 2D)
- AssertionError – If number of features doesn’t match number of labels in batch processing
Command the builder to remove a vertex from the graph as quickly as possible.
- Parameters: external_label (int) – The external label that identifies the graph vertex to remove
Get the number of entries that will be added to the graph.
- Returns: Number of entries queued for addition
- Return type: int
Get the number of entries that will be removed from the graph.
- Returns: Number of entries queued for removal
- Return type: int
Set the number of threads used to extend the graph during building.
When the thread count is greater than 1 and the optimization target is not StreamingData, the builder will utilize multiple threads to add elements to the graph in parallel. By default, all available CPU cores/threads are used unless specified. Note: The order in which elements are added is not guaranteed when using multiple threads.
- Parameters: thread_count (int) – Number of threads to use for graph extension
Set the batch size parameters for parallel graph construction.
The builder processes elements in batches to minimize synchronization between threads. The total batch size is calculated as: batch_size = thread_count * tasks_per_batch * task_size
Effects of thread count and batch size: : - thread_count = 1 and batch_size = 1: low throughput, medium latency, order of elements is guaranteed
- thread_count > 1 and batch_size = 1: high throughput, low latency, order of elements is not guaranteed
- thread_count > 1 and batch_size > 1: highest throughput, highest latency, order of elements is not guaranteed
Note: The optimization target StreamingData always uses a thread count of 1.
-
Parameters:
- tasks_per_batch (int) – Number of tasks for each thread in one batch (default: 32).
- task_size (int) – Number of elements each thread processes in one task (default: 10).
Get the current batch size used for parallel graph construction.
- Returns: The current batch size
- Return type: int
Build the graph. This could be run on a separate thread in an infinite loop. Call stop() to end this process.
-
Parameters:
- callback – The callback that is called after each step of the build process. A BuilderStatus is the only argument to the function. If None nothing is printed. If callback is the string “progress”, a simple progress bar is printed to stdout.
- infinite – If set to True, blocks indefinitely, until the stop() function is called. Can be used, if build() is run in a separate thread.
Stop the build process.
The build process can only be stopped between steps, not during step execution.
deglib.builder.build_from_data(data: ndarray, labels: Iterable[int] | None = None, edges_per_vertex: int = 32, capacity: int = -1, metric: Metric = Metric.L2, rng: Mt19937 | None = None, optimization_target: OptimizationTarget = OptimizationTarget.LowLID, extend_k: int = 0, extend_eps: float = 0.2, improve_k: int = 0, improve_eps: float = 0.001, max_path_length: int = 5, swap_tries: int = 0, additional_swap_tries: int = 0, callback: Callable[[BuilderStatus], None] | str | None = None) → SizeBoundedGraph
Create a new graph built from the given data using an EvenRegularGraphBuilder.
This is a convenience function that creates a graph, builder, adds all data entries, and builds the complete graph in one call. Infinite building is not supported.
-
Parameters:
- data (np.ndarray) – Feature data with shape [N, D] where N is number of samples, D is dimensionality
- labels (Iterable *[*int ] | None) – Labels for each data entry with shape [N]. If None, labels 0 to N-1 are created. If numpy array, should have dtype uint32
- edges_per_vertex (int) – Number of edges per vertex in the graph
- capacity (int) – Maximum number of vertices the graph can hold. If <= 0, defaults to data.shape[0]
- metric (Metric) – Distance metric for measuring feature similarity
- rng (Mt19937 | None) – Random number generator. If None, a new Mt19937 will be created
- optimization_target (OptimizationTarget) – Optimization strategy based on data distribution characteristics
- extend_k (int) – Number of neighbors to consider when extending the graph
- extend_eps (float) – Epsilon value for neighbor search during graph extension
- improve_k (int) – Number of neighbors to consider when improving the graph
- improve_eps (float) – Epsilon value for neighbor search during graph improvement
- max_path_length (int) – Maximum number of edge swaps in a single improvement attempt
- swap_tries (int) – Number of improvement attempts per build step
- additional_swap_tries (int) – Additional improvement attempts after a successful improvement
- callback (Callable [ *[*deglib_cpp.BuilderStatus ] , None ] | str | None) – Callback function for build progress reporting. If “progress”, shows progress bar
- Returns: The constructed and optimized graph
- Return type: SizeBoundedGraph