BLOOM


BLOOM is an open-source, large-scale, multilingual generative AI model that supports 46 natural languages and 13 programming languages, aiming to promote freedom and openness in AI research.

Location:
France
Language:
ar,bg,de,el,en,es,fr,hi,it,ja
Collection time:
2025-07-26

BLOOM: Initiating a New Era of Large-Scale, Multilingual Open-Source AI Training Models

BLOOM is a generative large language model jointly developed by hundreds of AI researchers worldwide. It combines a very large parameter count, multilingual coverage, and openness, supporting 46 natural languages and 13 programming languages. The release of BLOOM symbolizes freedom and openness in AI research, and its generative capabilities and broad applicability have attracted significant attention in the industry.

BLOOM's main functions

BLOOM is an autoregressive generative large language model. Built on a transformer architecture, it has 176 billion parameters and supports 46 natural languages and 13 programming languages. It was developed by the BigScience Workshop and trained on the Jean Zay supercomputer in France, with the aim of promoting a transparent, reusable, and open AI research ecosystem. Its advantages include:

  • Multilingual support: It covers English, French, Chinese, Hindi, Arabic, and other languages.
  • Powerful generation capabilities: It can generate coherent, human-like text based on user prompts.
  • Downstream task transfer: It can be fine-tuned for NLP tasks such as summarization, question answering, translation, and information extraction.
  • Programming language compatibility: It handles mainstream programming languages such as Python, Java, and C++.
  • Fully open source and downloadable: Anyone can get it for free from Hugging Face and deploy it.

[Image: Screenshot from BLOOM's official website]

For example, BLOOM can be used for the following functions (source: function page link):

| Function type | Description |
| --- | --- |
| Text generation | Continuation, dialogue, short essay writing |
| Summarization / information extraction | Automatically generate text summaries and extract key information |
| Code completion | Code completion and generation for multiple programming languages |
| Semantic understanding | In some formats, it can handle reading comprehension and question answering |
| Multilingual translation | Supports multilingual translation (not a professional MT system, but usable for demos and experiments) |
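
Since BLOOM is an autoregressive model, the functions in the table above are typically expressed as text for the model to continue. The sketch below shows one hypothetical way to build such prompts. The template wording and the `build_prompt` helper are illustrative assumptions, not an official BLOOM prompt format.

```python
# Hypothetical prompt templates; the wording is an assumption, not an official format.
TEMPLATES = {
    "continue": "{text}",
    "summarize": "Text: {text}\nSummary:",
    "translate": "Translate from {src} to {tgt}.\n{src}: {text}\n{tgt}:",
    "code": "# Python\n# Task: {text}\n",
}


def build_prompt(task: str, text: str, **fields: str) -> str:
    """Fill the template for `task`; the model is expected to continue the result."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    return TEMPLATES[task].format(text=text, **fields)


if __name__ == "__main__":
    print(build_prompt("translate", "Good morning.", src="English", tgt="French"))
```

The resulting string would be passed to the model as the prompt, and the generated continuation read back as the answer.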

BLOOM's Data Diversity Statistics

BLOOM was trained on a highly diverse corpus, summarized in the following table:

[Image: BLOOM text generation example]

| Language or type | Quantity/Proportion |
| --- | --- |
| Natural languages | 46 |
| Programming languages | 13 |
| Preprocessed text size | 1.6 TB |
| Number of training tokens | 350 billion (350B) |
| Maximum supported text length | 2,048 tokens |
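
The 2,048-token limit in the table means longer inputs have to be split before they are sent to the model. The following is a minimal sketch of such a split over already-tokenized input. The helper name `split_into_windows` and the optional `overlap` parameter are assumptions made for illustration.

```python
from typing import List

MAX_TOKENS = 2048  # maximum sequence length quoted in the table above


def split_into_windows(token_ids: List[int], max_len: int = MAX_TOKENS,
                       overlap: int = 0) -> List[List[int]]:
    """Split token IDs into windows of at most `max_len`, sharing `overlap` tokens."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    for start in range(0, len(token_ids), step):
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # the last window already reaches the end of the input
    return windows
```

A small overlap can help keep context across window boundaries, at the cost of processing some tokens twice.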

For more model details, please see the official Hugging Face documentation.

[Image: Official Q&A page]

BLOOM's pricing and plans

As an open-source model, BLOOM's base model is completely free, and anyone can download it from Hugging Face and deploy it locally without paying any licensing fees.

BLOOM is released under the BigScience RAIL license, which allows individuals, research institutions, and community groups to use and modify it free of charge but explicitly prohibits use in scenarios that violate ethics or the law. If you use cloud-based inference services, customized APIs, or enterprise-grade deployments, Hugging Face may offer separate paid options. These are value-added services provided by cloud vendors and platform providers and do not conflict with the open nature of the BLOOM model itself.

| Version/Service category | Price/License | Access method |
| --- | --- | --- |
| BLOOM full model | Free, RAIL open source license | Official website |
| API cloud inference | Based on Hugging Face prices | API page |
| Local deployment | Free | Hardware resources must be provided by the user |

For more pricing and deployment details, please visit the Hugging Face pricing page.

[Image: Statistical chart of diversity in model training data]

How to use BLOOM

BLOOM is designed to be "useful out of the box," supporting multi-platform and multi-framework calls. Developers can use it in the following ways:

  1. Download the weights and tokenizer directly: they can be loaded and used locally with PyTorch/Transformers.
  2. Direct cloud-based inference via the Hugging Face API (registration and API key required).
  3. It supports fine-tuning/transfer learning to meet specific business needs.

Quick use example (see the official Quick Start Guide):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # "bigscience/bloom" is the full 176B checkpoint and needs multiple GPUs;
    # for a quick local trial, substitute a smaller one such as "bigscience/bloom-560m".
    tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
    model = AutoModelForCausalLM.from_pretrained("bigscience/bloom")

    prompt = "Please briefly introduce the main functions of the BLOOM model."
    inputs = tokenizer(prompt, return_tensors="pt")  # PyTorch tensors
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If you only need a small-scale trial, there are also interactive web demos available in Hugging Face Spaces.
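
For the cloud-based route (option 2 in the list above), a request might be assembled as in the sketch below. The endpoint URL pattern, the `inputs`/`parameters` payload shape, and the helper names are assumptions based on Hugging Face's hosted Inference API as commonly documented. Check the current Hugging Face documentation before relying on them, since the service and its availability for this model may have changed.

```python
import json
import urllib.request

# Assumed endpoint pattern of the hosted Inference API; verify it against the
# current Hugging Face documentation before relying on it.
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"


def build_request(prompt: str, api_token: str,
                  max_new_tokens: int = 50) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str, api_token: str) -> object:
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(prompt, api_token), timeout=60) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect the payload without network access or a valid API key.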

[Image: Paid plans]

Hardware requirements:

| BLOOM parameter scale | Recommended hardware |
| --- | --- |
| Full 176B-parameter version | Multiple A100 GPUs / enterprise servers |
| Lightweight versions such as 7B1, 3B, or 1B1 | A single high-end GPU is sufficient |
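
These recommendations follow from simple arithmetic on the size of the weights. The sketch below estimates memory for the weights alone. It assumes 2 bytes per parameter (16-bit weights) and, for the GPU count, an 80 GB card, and it ignores activations and other runtime overhead, so real requirements are higher.

```python
import math

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def min_gpus_for_weights(num_params: float, gpu_memory_gb: float,
                         dtype: str = "bfloat16") -> int:
    """Smallest number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(num_params, dtype) / gpu_memory_gb)


if __name__ == "__main__":
    print(weight_memory_gb(176e9))          # 352.0
    print(min_gpus_for_weights(176e9, 80))  # 5 (assuming 80 GB per GPU)
```

At 16-bit precision the 176B model needs roughly 352 GB for weights, which is why it cannot fit on a single GPU, while a model of about 7 billion parameters needs roughly 14 GB.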

Who is BLOOM suitable for?

BLOOM is positioned as "open source, cutting-edge technology", and is therefore suitable for the following groups:

  • Academic researchers and university faculty and students: It can be used for NLP research, result reproduction, model fine-tuning, etc.
  • AI developers and engineers: Integrate it into product prototypes and validate AI capabilities.
  • Multilingual application developers: Serve multinational corporations or linguistically diverse user groups.
  • Data scientists: Use it for specific registers, domain knowledge extraction, and custom tasks.
  • Open source community contributors: This includes model optimization, evaluation, and development of supporting tools.
  • Programming education and automation tool developers: Experiment with AI code generation and completion.

Example applications:

| Application type | Demonstrative value |
| --- | --- |
| Multilingual document generation/summarization | Automatic synthesis of multilingual information |
| Question answering and chatbots | Build an assistant that supports multiple languages |
| Code understanding and completion | Support for subject-specific programming assistance |
| Cross-language content creation | Content automation for global users |
| Low-resource language research | Promoting the protection of language diversity |

[Image: Official documentation]

For detailed information on suitable users and operating suggestions, please refer to the official documentation.

Highlights of BLOOM Technical Architecture and AI Training Model

Model architecture features

  • It adopts a decoder-only structure, similar to GPT-3, but covers more languages, which is intended to improve cross-lingual transfer and generalization.
  • It has 176 billion parameters and supports sequence lengths of up to 2,048 tokens, with the aim of covering a wider range of semantic understanding and generation.

| Architecture parameter | Value |
| --- | --- |
| Number of layers | 70 |
| Number of attention heads | 112 |
| Hidden layer dimension | 14336 |
| Vocabulary size | 250,680 |
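
The figures in the table are consistent with the 176B headline number. The sketch below derives the per-head dimension and a rough parameter count using the common rule of thumb of about 12·d² weights per transformer block. This assumes a standard block with a 4× feed-forward expansion and ignores biases and layer norms, so it is an approximation rather than an exact count.

```python
N_LAYERS = 70
N_HEADS = 112
HIDDEN = 14336
VOCAB = 250_680

# Each attention head works on an equal slice of the hidden dimension.
head_dim = HIDDEN // N_HEADS

# Rule of thumb: ~4*d^2 attention weights + ~8*d^2 feed-forward weights per block
# (assumes a 4x feed-forward expansion; biases and layer norms are ignored).
per_layer = 12 * HIDDEN ** 2
embedding = VOCAB * HIDDEN
approx_params = N_LAYERS * per_layer + embedding

print(head_dim)                       # 128
print(f"{approx_params / 1e9:.1f}B")  # 176.2B
```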

See: More technical details

Diversity and fairness of AI training models

  • Extensive data coverage: It includes 46 natural languages, 13 programming languages, and 1.6 TB of high-quality text.
  • Diversity as a design principle: It emphasizes proportional sampling of low-resource languages and stresses "open source, openness, and inclusivity".
  • Multiple model versions: In addition to the full version with 176B parameters, lightweight versions such as 7B1 and 3B are also available for users with limited resources.
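
To make the idea of rebalancing toward low-resource languages concrete, the sketch below uses exponent-smoothed sampling, a technique commonly used for multilingual corpora. This is a swapped-in illustration: this page does not state that BigScience used this exact scheme, and the token counts in the usage note are hypothetical.

```python
from typing import Dict


def smoothed_sampling_probs(token_counts: Dict[str, float],
                            alpha: float = 0.5) -> Dict[str, float]:
    """Raise each language's share to the power `alpha` and renormalize.

    alpha = 1 keeps the raw proportions; smaller values up-weight
    languages with less data.
    """
    total = sum(token_counts.values())
    weights = {lang: (n / total) ** alpha for lang, n in token_counts.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical counts: one high-resource and one low-resource language.
    print(smoothed_sampling_probs({"en": 900, "sw": 100}))
```

With these hypothetical counts, the low-resource language's share rises from 10% of the raw data to about 25% of the sampled data.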

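BLOOM's Risks, Limitations, and Usage Recommendations

The limitations and risks must be taken seriously:

  • Not a tool for high-stakes decisions: Model output can "look plausible while its factual accuracy still needs to be verified," so it is not suitable for direct decision-making in biomedical, financial, legal, or similar settings.
  • May output harmful content: for example, biased, offensive, or sensitive language.
  • Ethics and data compliance must be strictly observed: Follow the RAIL license and do not misuse the model in violation of it.

| Main risk type | Description |
| --- | --- |
| Viewpoint bias / data imbalance | Information about some groups appears at different frequencies |
| Personal information leakage | The training data may contain sensitive content |
| Generation of misinformation | Generated content is not 100% factual |
| Use in inappropriate domains | Automated evaluation of individuals and critical-judgment scenarios are prohibited |

[Image: Risks and limitations documentation]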

See details: Risks and limitations documentation

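BLOOM FAQ

What versions of the BLOOM model are available, and how do I choose among them?

BLOOM is available at several parameter scales, from very small (bloom-560m) to very large (bloom-176B).

  • With limited hardware resources: a lightweight version such as 7B1 or 3B is recommended.
  • For research and high-performance needs: the full 176B version can be used, but it requires distributed multi-GPU deployment.

For the full list of versions, see the BLOOM model list.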
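
The choice can be roughed out from available GPU memory. The sketch below picks the largest checkpoint whose 16-bit weights fit with some headroom. The checkpoint identifiers and nominal parameter counts are listed as I understand them to be published on Hugging Face and should be checked against the model list. The `pick_checkpoint` helper and the 1.2× headroom factor are illustrative assumptions.

```python
from typing import Optional

# Nominal parameter counts read off the checkpoint names (approximate).
CHECKPOINTS = [
    ("bigscience/bloom-560m", 0.56e9),
    ("bigscience/bloom-1b1", 1.1e9),
    ("bigscience/bloom-1b7", 1.7e9),
    ("bigscience/bloom-3b", 3e9),
    ("bigscience/bloom-7b1", 7.1e9),
    ("bigscience/bloom", 176e9),
]


def pick_checkpoint(memory_gb: float, bytes_per_param: int = 2,
                    headroom: float = 1.2) -> Optional[str]:
    """Largest checkpoint whose weights, times a safety factor, fit in `memory_gb`."""
    best = None
    for name, n_params in CHECKPOINTS:  # ordered from smallest to largest
        if n_params * bytes_per_param * headroom / 1e9 <= memory_gb:
            best = name
    return best


if __name__ == "__main__":
    print(pick_checkpoint(24))  # bigscience/bloom-7b1
    print(pick_checkpoint(8))   # bigscience/bloom-3b
```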

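Can BLOOM be used in commercial products?

Under the open RAIL license, BLOOM can generally be used in commercial applications, as long as the use is lawful and does not fall into high-risk or prohibited scenarios. You should still read the license in detail to make sure you do not violate its additional terms. Commercial calls to a cloud API are also subject to payment under Hugging Face's additional platform terms.

Can BLOOM be fine-tuned? Does it work well on my own data?

BLOOM is designed to be transferable and fine-tunable, and the development team and community have published a variety of practical fine-tuning approaches. Using the public Transformers toolkit, developers can adapt BLOOM on their own datasets for downstream tasks such as classification, tagging, and generation.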
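
As a rough illustration of what such an adaptation might look like, the sketch below fine-tunes a small checkpoint on prompt/completion pairs with the Transformers `Trainer`. It is a minimal sketch under stated assumptions, not the BLOOM team's recipe: the `to_training_text` and `fine_tune` helpers, the data format, the choice of `bigscience/bloom-560m`, and all hyperparameters are placeholders.

```python
from typing import Dict, List


def to_training_text(example: Dict[str, str]) -> str:
    """Join a prompt/completion pair into one string for causal-LM training."""
    return f"{example['prompt']}\n{example['completion']}"


def fine_tune(examples: List[Dict[str, str]],
              model_name: str = "bigscience/bloom-560m",
              output_dir: str = "bloom-finetuned") -> None:
    # Imported lazily so the helper above works without the heavy dependencies.
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Append the end-of-sequence token so the model learns where to stop.
    texts = [to_training_text(ex) + tokenizer.eos_token for ex in examples]
    encoded = tokenizer(texts, truncation=True, max_length=512)
    train_dataset = [{"input_ids": ids} for ids in encoded["input_ids"]]

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=train_dataset,
        # mlm=False: the collator pads each batch and builds labels from input_ids.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model(output_dir)
```

Calling `fine_tune` downloads the checkpoint and requires PyTorch and Transformers to be installed. Larger checkpoints would additionally need multi-GPU or parameter-efficient setups that are not shown here.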

Fine-tuning tutorials and practice: see the official documentation and community posts.

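Conclusion

BLOOM has become a "milestone" in the democratization of NLP and in open AI collaboration. Its multilingual capabilities and open ecosystem give developers and AI model enthusiasts worldwide fertile ground for innovation. Whether for research experiments, the preservation of language diversity, or prototyping intelligent products, BLOOM offers a flexible, open, and high-performance option. If you are interested in trying it, visit the official BLOOM documentation to start exploring.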



