Hey 👋🏽, I'm cpuimage
Hi, I am ZhiHan Gao, living in Shantou, China.
I specialize in developing audio, video, and image processing algorithms, and I share my open-source projects on GitHub. If you find my projects useful, please consider buying me a coffee. Your support is greatly appreciated!
Professional Experience
- 👨🏽‍💻 I have worked at leading tech companies, including Baidu and KingSoft, among others.
- 🌱 Developed algorithms for multiple applications:
- 💡 Provided AI-based technical customization services, implementing and delivering several AI projects.
Research Progress and Achievements
- 🌱 Here are some of my past research endeavors and achievements in deep learning and statistical algorithms:
- Deep Learning
- [x] A Trimap-Free Solution for Real-Time Automatic Portrait Matting on Mobile Devices
- [x] ~~A Robust Optimizer With Accelerated Convergence Capability in Deep Learning~~
- [x] ~~A General and Adaptive Robust Loss Structure Scheme~~
- [x] ~~A Robust Loss Weighting Solution For Learning Long-Tail Data~~
- [x] Image Synthesis and Semantic Manipulation Using Stable Diffusion Networks
- [x] Stable Diffusion Architecture Optimization And Deployment On Mobile Devices
- [x] A Robust Solution For Accelerated Training Convergence And Learning Long-Tail Data
- [x] An Arbitrary-Resolution Super-Resolution Solution for the Real World
- [x] Accelerate Stable Diffusion FP16 Inference Deployment Optimization with TensorRT
- [x] Port Stable Diffusion X4 Upscaler To TensorFlow And Support FP16 Inference Deployment
- [x] Port Stable Diffusion PromptGen (GPT2) To TensorFlow And Support ONNX Inference Deployment
- [x] Stable Diffusion Architectural Distillation
- [x] Content-aware 3-view synthesis based on Stable Diffusion in Game Art
- [x] Super Resolution Solution based on Stable Diffusion
- [x] Video Editing techniques based on Stable Diffusion
- [x] Port Stable Diffusion XL 1.0 To TensorFlow And Support FP16 Inference Deployment
- [x] A Plug-And-Play Algorithm For Asynchronous Inference With Frequency-Domain Decomposable Reconstruction For Arbitrary Visual Scenes
- [x] ~~Stable Diffusion Inference With PyTorch Weights And More Features Like Stable Diffusion Web UI In Keras 3.x~~
- [x] FLUX.1 Support for FP16 Inference Deployment and Low-Memory LoRA Training in PyTorch
- [x] LLM from Scratch with PyTorch
- [x] Enhanced FaceFusion: Decoupled Modules and Optimized Inference for Visual Performance
- [x] Ultra High-Resolution Portrait Retouching
- [x] Training-Free Universal High-Resolution Synthesis for Any Video Model
- [x] Chunked Flash Attention in Keras
- [x] Robustness and Speed, Effortlessly: An Adaptive, Efficient Optimizer for Stable Training
  - [x] Learning-Rate-Free
  - [x] Warmup-Free
  - [x] Normalization-Free
  - [x] Corrected Gradient Accumulation for Large-Batch-Equivalent Performance
  - [x] Long-Tailed Gradient Mitigation
  - [x] Accelerated Convergence
  - [x] Memory-Efficient
- [x] Loss Regularization: A Novel Approach to Enhance Model Generalization and Convergence
- [x] A Simple Yet Effective Approach to Multi-Task Learning via Dynamic Loss Weighting
- [x] A Parameter-Free Weight Regularization Approach
- [x] Towards Stable Batch Normalization via Adaptive Moving Averages
- [x] LLM Memory-Efficient Training
- [x] Mitigating Numerical Instability in Training via Scalable Parallel Compensated Reductions in PyTorch
- [x] MozzyTokenizer: Adaptive Byte-level Tokenizer via Dynamic Encoding Selection
- Statistical Algorithms
- [x] Real-Time and Embedded Implementation of Speech Enhancement Algorithms Based on Minimum Mean-Square Error Short-Time Spectral Amplitude Estimation (MMSE-STSA)
Collaboration and Contact
- 👯 I'm looking to collaborate on audio and image algorithms.
- 💬 I'm available for paid technical services and solution consulting.