Code

AIGC

  • Summary
  • LLM
    • EcommerceLLM :An LLM for e-commerce scenarios, fine-tuned from qwen1.5 and llama3.
    • EcommerceLLMQwen2.5 :A Qwen2.5 series e-commerce large language model fine-tuned on e-commerce data.
    • MiniLLaMA3 :A mini version of Llama 3, covering the entire pipeline: data construction (0-1), tokenizer training, pre-training (PT), supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF).
    • ECOMCPM :A Chinese pretrained language model similar to GPT2, trained on e-commerce data from the ‘What’s Worth Buying’ website.
  • LMM
    • XrayLLaVA :A large multi-modal model for Xray, fine-tuned from the llava1d6-mistral-7b-instruct model on 4 V100 GPUs. LLaVA is among the most popular methodologies and architectures for large multi-modal language models, so fine-tuning it helps us evaluate and compare the potential of training such models in vertical scenarios.
    • XrayQwenVL :A large multi-modal model for Xray, fine-tuned from the qwenvl-chat model on 4 V100 GPUs.
    • XrayQwen2VL :Qwen2vl fine-tuned on the open-source Xray dataset. The trained LoRA weights have been released for academic research. For inference, load the original qwen2-vl-7b-instruct weights separately and merge the LoRA weights into them with llamafactory’s LoRA merge function. Llamafactory 0.9.0 was used for fine-tuning in this experiment.
    • EcommerceOCRBench :An OCR benchmark dataset for multimodal large language models in e-commerce, modeled after OCRBench but larger in scale.
    • OCRPaliGemma :A multimodal large language model with a focus on OCR text detection.
  • SD
    • HOME-CLIP :A ChineseCLIP model fine-tuned on home decoration and furniture data crawled from Visual China.
    • HOME-DALLE1 :DALL-E 1 model for Chinese home decoration and furniture scenes.
    • controlnet_aux_add :An add-on library for huggingface’s ControlNet aux that provides ControlNet preprocessors not present in aux.
    • EcommerceSD :Image generation for e-commerce scenarios, including model generation and inpainting.
    • MaskControlnet :A ControlNet-based generative model conditioned on masks, trained on a massive dataset of e-commerce cutout images (saliency map detection data).
  • Video generation
  • Digital Human
    • Wav2lipAll :Training of a lip-driven virtual digital human based on wav2lip, including the data processing procedures. Model sizes include 96x96, 192x192, 192x288, and 288x288.
    • TalkingFace :A training set for 2D virtual digital human projects similar to wav2lip and geneface++.
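
The XrayQwen2VL entry above notes that the released LoRA weights have to be merged into the base qwen2-vl-7b-instruct weights before inference. As a rough illustration of what such a merge does arithmetically, the sketch below replaces a weight matrix W with W + (alpha / r) * B A. This is not llamafactory's implementation; the function names, the toy matrices, and the alpha and r values are all hypothetical and chosen only to keep the example small.

```python
# Minimal pure-Python sketch of a LoRA merge (illustrative assumption,
# not llamafactory's code): merged = W + (alpha / r) * (B @ A).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return a new matrix w + (alpha / r) * (lora_b @ lora_a).

    w      : base weight, shape (out, in)
    lora_a : low-rank factor A, shape (r, in)
    lora_b : low-rank factor B, shape (out, r)
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update, shape (out, in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy values (hypothetical): a 2x2 base weight and a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (1, 2)
lora_b = [[0.5], [1.0]]    # shape (2, 1)
merged = merge_lora(w, lora_a, lora_b, alpha=2, r=1)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint can be served like an ordinary model.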

Comfyui-extension and Stable-diffusion-webui-extension

CV and Creatives

  • CV
    • Camera_blur_detection :Detects regions in photos captured by the camera and determines whether they are blurred. C++ code built with VS2019, using FastDeploy for multi-platform deployment.
    • mmdetection_add :Object detection algorithms implemented as additions to mmdetection, including EfficientDet and YOLOv4/v5.
    • mmclassification_add :Classification algorithms implemented as additions to mmcls, including GhostNet.
    • Answer_card_identification :Answer sheet recognition project for intelligent grading.
    • mmsynth :Reorganized text_render in mm format.
  • Creatives
    • TextErasing :A text erasing algorithm: Alibaba’s self-supervised text erasing with controllable image synthesis. Two versions will be provided.
    • AllRank :A learning-to-rank framework that serves as the re-ranking module in the recall / coarse ranking / fine ranking / re-ranking pipeline. It was previously used mainly for dynamic creative optimization, re-ranking with features that include images.
    • mmgeneration_add :GAN and other traditional image generation algorithms.
    • Xiaobao :VideoClip, a video editing application.
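
The Camera_blur_detection entry above describes a blur determination but does not say which measure it uses. One common heuristic, shown here purely as an assumption and not as that project's method, is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurred ones do not. The function names and the threshold of 100.0 are hypothetical.

```python
# Variance-of-Laplacian blur heuristic in pure Python (an illustrative
# assumption; the original project is C++ and its method is not stated).

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    gray is a 2D list of grayscale intensities, at least 3x3 in size.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def is_blurry(gray, threshold=100.0):
    """Flag the image as blurry when edge energy falls below threshold."""
    return laplacian_variance(gray) < threshold


# A flat patch has no edges; a checkerboard is all edges.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]
flat_is_blurry = is_blurry(flat)
checker_is_blurry = is_blurry(checker)
```

In practice the threshold would need tuning per camera and per region size, since the raw variance depends on resolution and noise.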

Deployment Acceleration

  • KuaiZai :Project code for multi-platform deployment.
  • PlateRec :License plate recognition in C++, based on PaddleOCR and ONNX Runtime.
  • Yolov5_rknnlite2 :YOLOv5 pedestrian detection, deployed on RK3588 with RKNNLite2.

Hyperspectral classification

Learning
