Introduces the v2.0.0 release of TinyGPU #1
Merged
This pull request introduces the v2.0.0 release of TinyGPU, featuring significant enhancements to the instruction set, visualization, continuous integration, and documentation. The release adds new shared memory instructions, improves synchronization semantics, expands the visualizer, and updates the project structure and examples to better support educational use and extensibility.
Major new features and improvements:
Instruction Set and Core Functionality:
[…] SHLD and SHST for robust per-block shared memory operations, and improved SYNC/SYNCB semantics for better thread and block coordination. [1] [2] [3]
Visualizer and Example Scripts:
[…] run_odd_even_sort.py, run_reduce_sum.py, run_sync_test.py, new run_block_shared_sum.py) to output GIFs to src/outputs/<script_name>/ and included new example programs, such as block shared memory sum and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]
Continuous Integration and Code Quality:
[…] dev branch and Python 3.13, and added badges for linting, code style, and tests. Integrated ruff and black for linting and formatting. [1] [2] [3]
Documentation and Project Structure:
[…] README.md and added docs/index.md for v2.0.0, with detailed changelogs, updated examples, project layout, and instruction set reference. Updated all documentation and image paths to reflect the new src/outputs/ organization. [1] [2] [3] [4] [5]
New and Improved Examples:
[…] block_shared_sum.tgpu and runner), and an interactive REPL debugger (debug_repl.py). [1] [2]
These changes collectively make TinyGPU more powerful, extensible, and user-friendly for both educational and development purposes.
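To make the per-block shared memory idea concrete, here is a minimal Python sketch of a block-level sum. It is not TinyGPU's implementation and does not use TinyGPU's API: the `Block` class, its `store`/`load` methods, and `block_shared_sum` are hypothetical names chosen for illustration, and the real operand formats of SHST, SHLD, and SYNC are not shown in this description. The sketch only assumes that each block owns a private shared-memory array and that a barrier separates the write phase from the read phase.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One thread block with its own private shared memory (hypothetical model)."""

    num_threads: int
    shared: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # One shared-memory word per thread, zero-initialised.
        self.shared = [0] * self.num_threads

    def store(self, addr: int, value: int) -> None:
        """Write to shared memory; loosely analogous to an SHST-style store."""
        self.shared[addr] = value

    def load(self, addr: int) -> int:
        """Read from shared memory; loosely analogous to an SHLD-style load."""
        return self.shared[addr]


def block_shared_sum(values: list[int], threads_per_block: int) -> list[int]:
    """Return one partial sum per block, routed through that block's shared memory."""
    sums = []
    for start in range(0, len(values), threads_per_block):
        chunk = values[start:start + threads_per_block]
        block = Block(num_threads=len(chunk))

        # Phase 1: every thread writes its own element into its shared slot.
        for tid, value in enumerate(chunk):
            block.store(tid, value)

        # Barrier stand-in: phase 1 finishes for all threads before phase 2
        # starts, which is the guarantee a block-level sync is assumed to give.

        # Phase 2: a single thread (thread 0) reads every slot and accumulates.
        total = 0
        for addr in range(block.num_threads):
            total += block.load(addr)
        sums.append(total)
    return sums


if __name__ == "__main__":
    print(block_shared_sum([1, 2, 3, 4, 5, 6, 7, 8], threads_per_block=4))  # [10, 26]
```

Because each `Block` owns its own `shared` list, writes in one block can never be observed by another, which is the isolation property that "per-block" shared memory implies.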
[…] SHLD and SHST instructions for shared memory, improved SYNC semantics, and refactored core execution for extensibility and performance. [1] [2] [3]
[…] src/outputs/, and added new examples for block shared memory and a REPL debugger. [1] [2] [3] [4] [5] [6] [7] [8] [9]
[…] dev branch, and added badges for linting and tests. Integrated ruff and black for code quality. [1] [2] [3]
[…] README.md and new docs/index.md with v2.0.0 changelog, usage instructions, new project layout, and instruction set reference. [1] [2] [3] [4] [5]
[…] block_shared_sum.tgpu, run_block_shared_sum.py, and debug_repl.py for demonstrating block-level operations and interactive debugging. [1] [2]
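The src/outputs/<script_name>/ layout mentioned above could be derived from a runner script's path roughly as follows. This is a sketch under assumptions: `output_dir_for` is a hypothetical helper, not a function from the TinyGPU codebase, and it assumes <script_name> means the file name without the .py extension.

```python
from pathlib import Path


def output_dir_for(script_path: str, root: str = "src/outputs") -> Path:
    """Map a runner script to <root>/<script name without extension>.

    Hypothetical helper: treating <script_name> as the file stem is an assumption.
    """
    return Path(root) / Path(script_path).stem


if __name__ == "__main__":
    # A runner would typically create the folder before writing its GIF.
    target = output_dir_for("run_reduce_sum.py")
    print(target.as_posix())  # src/outputs/run_reduce_sum
```

Keeping one folder per script means repeated runs of different examples do not overwrite each other's GIFs.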