Python coding interview practice problems.
Curated collection of Chinese stable diffusion base models.
Python learning notes, recipes, and cheatsheets.
Practical guide for building high-resolution remote sensing image datasets.
Wav2Lip-based digital human training with lip-sync driving (96-288px).
E-commerce LLM fine-tuned from Qwen1.5 and LLaMA3.
E-commerce LLM fine-tuned from the Qwen3 series on e-commerce data.
E-commerce LLM fine-tuned from the Qwen2.5 series on e-commerce data.
E-commerce LLM fine-tuned from Qwen1.5 and LLaMA3.
Mini LLaMA3 covering the full pipeline: data, tokenizer, pretraining (PT), SFT, and RLHF.
Fine-tune DeepSeek-R1 on medical data.
Chinese GPT-2-style model pretrained on e-commerce data.
Unified interface for LLM, multimodal, and FLUX generation, deployed with FastAPI.
Agent application for automatic monitoring, LLM rewriting, and automatic publishing.
X-ray multimodal model fine-tuned from LLaVA 1.6 on 4 V100 GPUs.
X-ray multimodal model fine-tuned from QwenVL-Chat.
X-ray multimodal model fine-tuned from Qwen2-VL-7B-Instruct.
X-ray multimodal model fine-tuned from LLaMA3.2-Vision on 4 A800 GPUs.
OCR detection multimodal model fine-tuned from InternVL2-8B.
OCR VQA multimodal model fine-tuned from InternVL2-8B.
OCR text detection multimodal model based on PaliGemma.
Large-scale OCR benchmark for multimodal LLMs in e-commerce.
Collection of Chinese stable diffusion base models.
ID customization for character-consistent generation on Flux and SD.
First real-time Flux-based sketch-to-image generation model.
ControlNet conditioned on masks, trained on e-commerce cutout data.
Flux ACE++-based image editing for character-consistent edits.
WebUI for ChatDiT that supports generating images through conversation.
Stable diffusion models for e-commerce image generation and inpainting.
ChineseCLIP fine-tuned on home decoration and furniture data.
DALL-E 1 model for Chinese home decoration scenes.
Additional preprocessors for the ControlNet auxiliary library.
Wav2Lip-based digital human training with lip-sync driving (96-288px).
Training set for 2D talking-face projects (Wav2Lip, GeneFace++).
VideoClip, a video editing application.
Dataset for creating e-commerce animations.
Collection of Chinese video generation models.
Unified training framework for image and video generation models.
Additional detection algorithms (EfficientDet, YOLOv4/v5) for mmdetection.
Additional classification algorithms (GhostNet, etc.) for mmcls.
OCR algorithms organized in the mm framework.
GAN and traditional image generation algorithms.
Text rendering reorganized in the mm format.
Camera photo blur detection with FastDeploy multi-platform deployment.
Intelligent answer-sheet grading system.
Data augmentation for object detection and segmentation.
RGB to CMYK conversion for offline print materials.
Curated visual AI projects on HuggingFace, ModelScope, and PaddleHub.
Python coding interview practice problems.
Python learning notes.
Guide for creating high-resolution remote sensing image datasets.
LeetCode Hot 100 problems in Python.
Common data processing methods for deep learning.
Paper reading notes on deep learning, remote sensing, OCR, and generation.
Parking spot finder mobile application.
Deep learning framework implemented with NumPy (TF-style static graph and PyTorch-style dynamic graph).
Hyperspectral classification models in the mm framework.
Simple example for understanding mmcv internals.