Qwen3 series e-commerce LLM fine-tuned with e-commerce data.
Qwen2.5 series e-commerce LLM fine-tuned with e-commerce data.
E-commerce LLMs fine-tuned from Qwen1.5 and LLaMA3.
Mini LLaMA3 covering the full pipeline: data, tokenizer, pretraining (PT), SFT, and RLHF.
DeepSeek-R1 fine-tuned on medical data.
Chinese GPT-2-style model pretrained on e-commerce data.
Unified interface for LLM, multimodal, and FLUX generation, deployed with FastAPI.
Agent application for automatic monitoring, LLM-based rewriting, and automatic publishing.
X-ray multimodal model fine-tuned from LLaVA 1.6 on 4 V100 GPUs.
X-ray multimodal model fine-tuned from QwenVL-Chat.
X-ray multimodal model fine-tuned from Qwen2-VL-7B-Instruct.
X-ray multimodal model fine-tuned from LLaMA3.2-Vision on 4 A800 GPUs.
OCR detection multimodal model fine-tuned from InternVL2-8B.
OCR VQA multimodal model fine-tuned from InternVL2-8B.
OCR text detection multimodal model based on PaliGemma.
Large-scale OCR benchmark for multimodal LLMs in e-commerce.
Collection of Chinese stable diffusion base models.
ID customization for character-consistent generation on Flux and SD.
First real-time Flux-based sketch-to-image generation model.
ControlNet conditioned on masks, trained on e-commerce cutout data.
Character-consistent image editing based on Flux ACP++.
WebUI for ChatDiT that supports generating images through conversation.
DVI: Training-free personalized generation via disentangling semantic and visual identity.
Training-free flexible identity injection for text-to-image generation.
Stable diffusion models for e-commerce image generation and inpainting.
ChineseCLIP fine-tuned on home decoration and furniture data.
DALL-E 1 model for Chinese home decoration scenes.
Additional preprocessors for ControlNet auxiliary library.
Intelligent banner design framework balancing creative freedom and design rules.
Wav2Lip-based digital human training with lip-sync driving (96-288px).
Training set for 2D talking face projects (Wav2Lip, GeneFace++).
VideoClip, a video editing application.
Dataset for creating e-commerce animations.
Collection of Chinese video generation models.
Unified training framework for image and video generation models.
Character identity consistency in text-to-image generation with minimal data.
Image evaluation system for EditID.
Enhanced ParaAttention for DiT inference with context parallelism.
Unified multi-GPU inference framework for image and video generation on consumer GPUs.
Multi-level collaborative acceleration framework for distributed diffusion inference.
Self-supervised text erasing with controllable image synthesis.
Learn-to-rank framework for dynamic creative optimization.
Additional detection algorithms (EfficientDet, YOLOv4/v5) for mmdetection.
Additional classification algorithms (GhostNet, etc.) for mmcls.
OCR algorithms organized in the mm framework.
GAN and traditional image generation algorithms.
Text rendering reorganized in mm format.
Camera photo blur detection with FastDeploy multi-platform deployment.
Intelligent answer sheet grading system.
Data augmentation for object detection and segmentation.
RGB to CMYK conversion for offline print materials.
Curated visual AI projects on HuggingFace, ModelScope, and PaddleHub.
Guide for creating high-resolution remote sensing image datasets.
3D densely connected convolutional network for hyperspectral image (HSI) classification.
Faster HSI classification based on a selective kernel mechanism.
Multi-scale dense networks for hyperspectral image classification.
Dynamic group convolution network for HSI classification.
Hybrid Mamba-Transformer vision backbone for HSI classification.
Hyperspectral classification models in the mm framework.
Spatial-geometry enhanced 3D dynamic snake CNN for HSI classification.
Efficient dynamic attention 3D convolution for HSI classification.
3D wavelet convolutions with extended receptive fields for HSI.
Dynamic 3D KAN convolution with adaptive grid for HSI classification.
Expert kernel generation network for HSI classification.
Transformer-based spectral-spatial attention decoupling for HSI.
Python coding interview practice problems.
Python learning notes.
LeetCode Hot 100 problems in Python.
Common data processing methods for deep learning.
Paper reading notes on deep learning, remote sensing, OCR, and generation.
Deep learning framework implemented with NumPy (TensorFlow-style static graph and PyTorch-style dynamic graph).
Simple example for understanding mmcv internals.
Parking spot finder mobile application.