Curated Resource

BLOOM 176B — how to run a real LARGE language model in your own cloud?

Tags: guide, llm, bloom, cloud

Curated: 09/04/2023 from medium.com/mlearning-ai/bloom-176b-how-to-run-a-real-large-language-model-in-your-own-cloud-e5f6bdfb3bb1

My notes

How to set up BLOOM on your own cloud.

BLOOM stands for "BigScience Large Open-science Open-access Multilingual Language Model":

free transformer-based language model created by 1000 researchers
trained on about 1.6 TB of pre-processed multilingual text
the biggest BLOOM model has 176B parameters, roughly GPT-3 scale
smaller models available: 7B, 3B, and 1.7B
The full model needs about 360 GB of RAM, but "Microsoft has provided a downsampled variant with INT8 weights (from original FLOAT16 weights) that runs on the DeepSpeed Inference engine and uses tensor parallelism.... tensors are split into 8 shards. So ... absolute model size is reduced and ... split and parallelized and can thus be distributed over 8 GPUs."
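The memory figures above follow from simple arithmetic on the parameter count. Here is a rough sketch in Python, assuming 2 bytes per FLOAT16 weight and 1 byte per INT8 weight, and counting weights only (activations, caches, and runtime overhead are ignored, which is presumably why the quoted figures of 360 GB and 180 GB are slightly higher). The function names are made up for illustration.

```python
# Back-of-the-envelope weight sizes for a 176B-parameter model.
# Assumption: weights only; real deployments need extra headroom.
PARAMS = 176_000_000_000
BYTES_PER_PARAM = {"float16": 2, "int8": 1}


def weights_gb(params: int, dtype: str) -> float:
    """Total size of the weights in GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


def per_gpu_gb(params: int, dtype: str, shards: int) -> float:
    """Weight size per GPU when tensors are split evenly into shards."""
    return weights_gb(params, dtype) / shards


if __name__ == "__main__":
    print(weights_gb(PARAMS, "float16"))   # 352.0
    print(weights_gb(PARAMS, "int8"))      # 176.0
    print(per_gpu_gb(PARAMS, "int8", 8))   # 22.0
```

By this estimate each of the 8 shards holds about 22 GB of INT8 weights, which is what makes spreading the model across 8 GPUs feasible.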

The author then provides instructions for hosting on AWS, as "it provides a SageMaker setup for a Deep Learning container capable of initializing the model... [but] get the instances through support, you can’t do it by self configuring... hosted model can be loaded from the Microsoft repository on Hugging Face into an S3 ... [it's] 180 GB... costs about $32 per hour when running... you can start it in about 18 min, shutting down and freeing the resources takes seconds... We put a custom API gateway and Lambda function in the interface on top of the SageMaker endpoint that allows users to connect externally with an API key."
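The post does not show the Lambda function itself, so the following is only a minimal sketch of what such a function might look like. It assumes API key checking is left to API Gateway, that the endpoint name comes from a hypothetical BLOOM_ENDPOINT_NAME environment variable, and that the model container accepts a JSON body with "inputs" and "parameters" fields. The actual request format depends on the inference handler deployed with the model. The only AWS call used is boto3's `invoke_endpoint` on the `sagemaker-runtime` client.

```python
import json
import os

# Hypothetical endpoint name; set it to match your own deployment.
ENDPOINT_NAME = os.environ.get("BLOOM_ENDPOINT_NAME", "bloom-176b-int8")


def build_payload(prompt, max_new_tokens=50):
    """Serialize a request body. The field names are an assumption."""
    return json.dumps(
        {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    )


def lambda_handler(event, context, client=None):
    """Forward a prompt from an API Gateway proxy event to the endpoint."""
    if client is None:
        # Imported lazily so the module also loads where boto3 is absent.
        import boto3

        client = boto3.client("sagemaker-runtime")

    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt")
    if not prompt:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "missing 'prompt'"}),
        }

    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    # The response body is a stream; read it and pass it through unchanged.
    result = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": result}
```

Passing the client in as an optional argument keeps the handler testable without AWS credentials. In Lambda itself the default client is created on demand.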

Read the Full Post

The above notes were curated from the full post medium.com/mlearning-ai/bloom-176b-how-to-run-a-real-large-language-model-in-your-own-cloud-e5f6bdfb3bb1.
