Open to Collab
90.0 TFLOPS
88
17
284
s3nh
PRO
s3nh
256 followers · 115 following
s3nhxx
s3nh
AI & ML interests
Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh
Recent Activity
Reacted to fblgit's post with 👍, about 15 hours ago
Introducing `HarEmb - PII`, a single-transformer-block layer distilled from the OpenMed PII Privacy filter. It's a very tiny model that reaches comparable results on PII classification through Viterbi BIOES decoding, retaining roughly 98% of the original model's performance while being a tiny fraction of the base model's size. It doubles throughput (tokens/s) and dramatically reduces both the active parameters and the VRAM footprint. The evaluation and benchmarking are in the model repository and can be reproduced. I trained it on an RTX 4090 without issues; it is compatible with the OpenMed suite and is an in-place replacement for the openai privacy-filter model. https://huggingface.co/fblgit/haremb-privacy-filter-opennemo I'm looking for people who want to co-author, contribute to, or endorse the HarEmb research and the technical paper for the model. Contact xavi@juanako.ai
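For readers unfamiliar with the decoding step mentioned in the post, here is a minimal sketch of Viterbi decoding constrained to valid BIOES sequences. It is not taken from the HarEmb repository: it assumes a single entity type, takes hypothetical per-token log-scores as input, and the names `allowed` and `viterbi_bioes` are made up for illustration.

```python
import math

# Assumption: one entity type, so the tag set is just the five BIOES labels.
TAGS = ["O", "B", "I", "E", "S"]


def allowed(prev, cur):
    """Return True if tag `cur` may follow tag `prev` under BIOES rules."""
    if prev in ("B", "I"):
        # An open span must continue (I) or close (E).
        return cur in ("I", "E")
    # After O, E, or S no span is open, so only O, B, or S may follow.
    return cur in ("O", "B", "S")


def viterbi_bioes(scores):
    """Find the highest-scoring valid BIOES path.

    `scores` is a list with one dict per token, mapping each tag to a
    log-score (hypothetical model output).
    """
    if not scores:
        return []
    neg_inf = -math.inf
    # A sequence cannot start inside a span, so I and E are ruled out at t=0.
    best = {t: (scores[0][t] if t in ("O", "B", "S") else neg_inf) for t in TAGS}
    backpointers = []
    for token_scores in scores[1:]:
        new_best, ptr = {}, {}
        for cur in TAGS:
            # Best predecessor among the tags that may legally precede `cur`.
            val, prev = max((best[p], p) for p in TAGS if allowed(p, cur))
            new_best[cur] = val + token_scores[cur]
            ptr[cur] = prev
        best = new_best
        backpointers.append(ptr)
    # A sequence cannot end with an open span, so it must end in O, E, or S.
    last = max(("O", "E", "S"), key=lambda t: best[t])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]
```

The transition constraints are what distinguish this from taking the per-token argmax: a greedy decoder can emit `B` followed by `O`, which is not a valid span, while the constrained search only returns well-formed sequences.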
Reacted to lbourdois's post with 🤗, about 15 hours ago
New blog post! An introduction to a little-known but highly effective model reduction method: Trimming ✂️
We show how to reduce model size (up to an 87.24% reduction) while preserving performance. We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text. From these 16 families, we generated over 5,500 monolingual models in 124 different languages 🌍
Key takeaways from our experiments:
1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.
2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).
3️⃣ The trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, simply fine-tune the model to recover or even surpass the original performance.
4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show it more data and end up with a better model than the original.
5️⃣ Trimming is a competitive alternative to distillation and quantization. E.g. we obtained our alternative to DistilBERT in 9 minutes on CPU vs. 90 hours of GPU for the latter.
6️⃣ Trimming could generate reasoning traces in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.
Many other things (such as how much data is needed, the impact of the database used, the order in which it should be done, etc.) are covered in the blog post!
Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: https://huggingface.co/spaces/alphaedge-ai/Trimming_models_search
Reacted to appvoid's post with 👀, 2 days ago
yikes! i missed the small model hackathon i guess i'll have to make sota for people to notice
s3nh's activity
New activity in raincandy-u/Rain-100M, 5 months ago
Hardware used
1
#2 opened 5 months ago by s3nh
New activity in s3nh/Mistral_Sonyichi-7B-slerp, 12 months ago
License Compatibility
1
#3 opened 12 months ago by qiuqiu666
New activity in city96/Wan2.1-FLF2V-14B-720P-gguf, about 1 year ago
Pipeline
#2 opened about 1 year ago by s3nh
New activity in Lajonbot/pythia-160m-33k-steps-self-instruct-polish, over 1 year ago
Adding `safetensors` variant of this model
#1 opened over 1 year ago by SFconvertbot
New activity in SmolTuners/README, over 1 year ago
Gh organization
8
#3 opened over 1 year ago by s3nh
Datasets
3
#1 opened over 1 year ago by s3nh
New activity in s3nh/Mistral_Sonyichi-7B-slerp, over 2 years ago
Adding Evaluation Results
#2 opened over 2 years ago by leaderboard-pr-bot
New activity in s3nh/Severusectum-7B-DPO, over 2 years ago
Adding Evaluation Results
#1 opened over 2 years ago by leaderboard-pr-bot
New activity in s3nh/Noromaid-Aeryth-7B, over 2 years ago
Adding Evaluation Results
#1 opened over 2 years ago by leaderboard-pr-bot
New activity in s3nh/SeverusWestLake-7B-DPO, over 2 years ago
Adding Evaluation Results
#1 opened over 2 years ago by leaderboard-pr-bot
New activity in zero-gpu-explorers/README, over 2 years ago
RuntimeError: CUDA must not be initialized in the main process on Spaces with Stateless GPU environment.
❤️
1
2
#6 opened over 2 years ago by mrfakename
New activity in s3nh/MiniCPM-2B-dpo-fp32-GGUF, over 2 years ago
error loading model: create_tensor: tensor 'output.weight' not found ?
6
#1 opened over 2 years ago by wukongai
New activity in s3nh/OEvortex-HelpingAI-GGUF, over 2 years ago
Hello bro
12
#1 opened over 2 years ago by Abhaykoul
New activity in s3nh/NousHermes-Kunoichi-SolarMaid-7b, over 2 years ago
Librarian Bot: Add base_model metadata to model
#2 opened over 2 years ago by librarian-bot
Librarian Bot: Add merge tag to model
#1 opened over 2 years ago by librarian-bot
New activity in s3nh/SnowLotus-v2-10.7B-GGUF, over 2 years ago
Problem with Quants
5
#2 opened over 2 years ago by Clevyby
New activity in s3nh/Blurred-Beagle-7b-slerp-GGUF, over 2 years ago
ty<3
❤️
1
1
#1 opened over 2 years ago by gate369
New activity in s3nh/SnowLotus-v2-10.7B-GGUF, over 2 years ago
Thanks :)
❤️
1
1
#1 opened over 2 years ago by BlueNipples
New activity in social-post-explorers/README, over 2 years ago
not able to post after exceed message length once
1
#19 opened over 2 years ago by s3nh
New activity in s3nh/Noromaid-Aeryth-7B-GGUF, over 2 years ago
Deleted GGUF?
6
#1 opened over 2 years ago by EloyOn