Show HN: Apex-1-flash, 4B LLM finetuned on RTX 5070
2 points
2 hours ago
| 0 comments
| huggingface.co
| HN
The goal was to create a highly efficient, small-scale model that can perform reasoning tasks while remaining lightweight enough to run easily on consumer hardware. Technical Stack: Base: Qwen3:4B Training: Fine-tuned using Unsloth for memory efficiency, which allowed me to run the process smoothly on an RTX 5070. Stack: Built with cu128, PyTorch, and Hugging Face Transformers. Dataset: Trained on Raymond-dev-546730/Open-CoT-Reasoning-Mini to improve Chain-of-Thought (CoT) capabilities.
No one has commented on this post.