[Build] Use CMAKE_ARGS to control optional features #24
Summary of Changes
Hello @haochengxia, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a significant overhaul to the project's build system and documentation. The primary goal is to provide more granular control over optional features during compilation using `CMAKE_ARGS`, while simultaneously restructuring and enhancing the user documentation for improved clarity and ease of use. This includes making optional features default to off for source builds but ensuring they are enabled in pre-built wheels, alongside a major reorganization of the documentation content and navigation.
Highlights
- Build System Refinement: Optional cache algorithms (`GLCACHE`, `LRB`, `3L_CACHE`) are now disabled by default in `CMakeLists.txt` and can be explicitly enabled using `CMAKE_ARGS` during compilation, providing more granular control for source builds.
- Pre-built Wheel Configuration: The `pyproject.toml` configuration for `cibuildwheel` has been updated to ensure that pre-built Python wheels include all optional features enabled by default, simplifying installation for most users who rely on pre-compiled binaries.
- Documentation Overhaul: The project's documentation structure has been completely reorganized, introducing new top-level sections like 'User Guide' and 'Developer Guide,' and breaking down 'Getting Started' and 'Examples' into more detailed sub-sections for improved clarity and navigation.
- Enhanced Quickstart Guide: A comprehensive new quickstart guide has been added, covering installation (including `uv` and `CMAKE_ARGS` usage), cache simulation, trace analysis, and the plugin system, with practical code examples to help users get started quickly.
- New Test Coverage: Dedicated tests for the optional `GLCache`, `LRB`, and `3LCache` algorithms have been added, ensuring their functionality when enabled in the build.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
[1] Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request refactors the build system to make optional features (GLCache, LRB, 3LCache) disabled by default and controllable via `CMAKE_ARGS`. This is a solid improvement for users building from source, as it reduces the default set of dependencies. The documentation is significantly restructured and improved to guide users through the new installation process for optional features. The error messages for missing optional features are also much more helpful.
My main concern is the removal of the API reference documentation, which is a critical resource for users. I've also provided a suggestion to improve the clarity of the error messages for when optional features are not installed.
        self.bytes_read = 0

    @property
    def hit_ratio(self) -> float:
        """Calculate hit ratio."""
        total = self.hits + self.misses
        return self.hits / total if total > 0 else 0.0

    @property
    def miss_ratio(self) -> float:
        """Calculate miss ratio."""
        return 1.0 - self.hit_ratio
```

## Error Handling

The library uses standard Python exceptions:

- `ValueError`: Invalid parameters or configuration
- `FileNotFoundError`: Trace file not found
- `RuntimeError`: Runtime errors from underlying C++ library
- `MemoryError`: Out of memory conditions

Example error handling:

```python
try:
    reader = lcs.TraceReader("nonexistent.csv", lcs.TraceType.CSV_TRACE)
except FileNotFoundError:
    print("Trace file not found")
except ValueError as e:
    print(f"Invalid configuration: {e}")
```

## Configuration Options

### Reader Configuration

```python
reader_params = lcs.ReaderInitParam(
    has_header=True,            # CSV has header row
    delimiter=",",              # Field delimiter
    obj_id_is_num=True,         # Object IDs are numeric
    ignore_obj_size=False,      # Don't ignore object sizes
    ignore_size_zero_req=True,  # Ignore zero-size requests
    cap_at_n_req=1000000,       # Limit number of requests
    block_size=4096,            # Block size for block-based traces
    trace_start_offset=0,       # Skip initial requests
)

# Field mappings (1-indexed)
reader_params.time_field = 1
reader_params.obj_id_field = 2
reader_params.obj_size_field = 3
reader_params.op_field = 4
```

### Sampling Configuration

```python
sampler = lcs.Sampler(
    sample_ratio=0.1,                     # Sample 10% of requests
    type=lcs.SamplerType.SPATIAL_SAMPLER  # Spatial sampling
)
reader_params.sampler = sampler
```

## Thread Safety

The library provides thread-safe operations for most use cases:

- Cache operations are thread-safe within a single cache instance
- Multiple readers can be used concurrently
- Analysis operations can utilize multiple threads

For high-concurrency scenarios, consider using separate cache instances per thread.

## Memory Management

The library automatically manages memory for most operations:

- Cache objects handle their own memory allocation
- Trace readers manage buffering automatically
- Request objects are lightweight and reusable

For large-scale simulations, monitor memory usage and consider:

- Using sampling to reduce trace size
- Processing traces in chunks
- Limiting cache sizes appropriately

## Best Practices

1. **Use appropriate cache sizes**: Size caches based on your simulation goals
2. **Set random seeds**: For reproducible results in synthetic traces
3. **Handle errors**: Always wrap file operations in try-catch blocks
4. **Monitor memory**: For large traces, consider sampling or chunking
5. **Use threading**: Leverage multi-threading for analysis tasks
6. **Validate traces**: Check trace format and content before simulation

[TBD]
The API reference documentation has been removed and replaced with "[TBD]". This is a significant documentation regression. While the rest of the documentation has been greatly improved, removing the API reference will make it harder for users to understand and use the library's features. If the plan is to auto-generate this documentation, it would be best to include that in this PR or a subsequent one soon. If not, please restore the previous content and update it to reflect the changes in this PR.
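If auto-generation is the route taken, a minimal sketch of the idea is shown below: walk a module's public classes and functions and emit a one-line summary from each docstring. This uses only the standard library `inspect` module and is a stand-in for whatever documentation toolchain the project actually adopts; the function name `api_reference_markdown` is hypothetical, and `json` is used as the target only so the sketch runs without `libcachesim` installed.

```python
import inspect
import json  # stand-in module so the sketch runs without libcachesim installed


def api_reference_markdown(module):
    """Render public classes and functions of a module with their docstring summaries."""
    lines = [f"# {module.__name__} API Reference", ""]
    for name, obj in sorted(vars(module).items()):
        # Skip private names and anything that is not a class or callable routine.
        if name.startswith("_") or not (inspect.isclass(obj) or inspect.isroutine(obj)):
            continue
        doc = inspect.getdoc(obj) or "(undocumented)"
        summary = doc.splitlines()[0]  # first docstring line as the summary
        lines.append(f"- `{name}`: {summary}")
    return "\n".join(lines)


reference = api_reference_markdown(json)
print(reference)
```

Pointing the same function at the installed package would give a stopgap listing until a full reference is restored.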
      from .libcachesim_python import ThreeLCache_init
  except ImportError:
-     raise ImportError("ThreeLCache is not installed. Please install it with `pip install libcachesim[all]`")
+     raise ImportError(
+         'ThreeLCache is not installed. Please install it with `CMAKE_ARGS="-DENABLE_3L_CACHE=ON" pip install libcachesim --force-reinstall`'
+     )
The new error message is a great improvement as it provides clear, actionable instructions for the user. To make it even more robust, consider making the message more explanatory about why the import fails and less tied to a specific package manager (`pip`). This would also be a good opportunity to suggest using `--no-cache-dir` to avoid issues with cached wheels.
A similar improvement can be applied to the error messages for `GLCache` and `LRB`.
Here's a suggested improvement:
      from .libcachesim_python import ThreeLCache_init
  except ImportError:
-     raise ImportError(
-         'ThreeLCache is not installed. Please install it with `CMAKE_ARGS="-DENABLE_3L_CACHE=ON" pip install libcachesim --force-reinstall`'
-     )
+     raise ImportError(
+         'ThreeLCache is not available. This optional feature must be enabled at build time. '
+         'Please reinstall with the `ENABLE_3L_CACHE` CMake option set to ON. '
+         'For example: `CMAKE_ARGS="-DENABLE_3L_CACHE=ON" pip install --force-reinstall --no-cache-dir libcachesim`'
+     )
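Since the same wording would be repeated for each optional algorithm, one possible way to keep the messages consistent is a small factory function, sketched below. The helper name `optional_feature_error` is hypothetical and not part of the pull request; only the `ENABLE_3L_CACHE` option name is taken from the diff, so the option names for `GLCache` and `LRB` would need to be checked against `CMakeLists.txt`.

```python
def optional_feature_error(feature, cmake_option, package="libcachesim"):
    """Build an ImportError explaining how to enable an optional feature.

    Hypothetical helper mirroring the wording of the suggested message.
    """
    return ImportError(
        f"{feature} is not available. This optional feature must be enabled at build time. "
        f"Please reinstall with the `{cmake_option}` CMake option set to ON. "
        f'For example: `CMAKE_ARGS="-D{cmake_option}=ON" pip install '
        f"--force-reinstall --no-cache-dir {package}`"
    )


err = optional_feature_error("ThreeLCache", "ENABLE_3L_CACHE")
print(err)
```

At the import site this could be raised as `raise optional_feature_error("ThreeLCache", "ENABLE_3L_CACHE") from exc` inside `except ImportError as exc:`, which also preserves the original traceback.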