dolly-v2-12b is a 12 billion parameter causal language model created by Databricks that is derived from EleutherAI's pythia-12b, fine-tuned on a ~15K record instruction corpus generated by Databricks employees and released under a permissive license (CC-BY-SA). Please refer to the dolly GitHub repo for tips on running inference for various GPU configurations.

To use the model with the transformers library on a machine with GPUs, first make sure you have the transformers and accelerate libraries installed. The instruction-following pipeline can be loaded using the pipeline function as shown below. This loads a custom InstructionTextGenerationPipeline found in the model repo, which is why trust_remote_code=True is required. Including torch_dtype=torch.bfloat16 is generally recommended if this type is supported, in order to reduce memory usage; it does not appear to impact output quality. It is also fine to remove it if there is sufficient memory.

generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

You can then use the pipeline to answer instructions:

res = generate_text("Explain to me the difference between nuclear fission and fusion.")

Alternatively, if you prefer not to use trust_remote_code=True, you can download instruct_pipeline.py, store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer.
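For context on what the custom pipeline does before generation, here is a rough sketch of the instruction prompt template it builds around your input. The constant names and exact wording below are paraphrased from the repo's instruct_pipeline.py, not copied verbatim, so treat them as illustrative:

```python
# Minimal sketch of Dolly's instruction prompt format, paraphrased from
# instruct_pipeline.py in the model repo; exact wording/spacing there
# may differ slightly.
INTRO = ("Below is an instruction that describes a task. "
         "Write a response that appropriately completes the request.")
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"

def build_prompt(instruction: str) -> str:
    # Wrap a raw user instruction in the fine-tuning template; the real
    # pipeline then returns only the text generated after RESPONSE_KEY.
    return f"{INTRO}\n\n{INSTRUCTION_KEY}\n{instruction}\n\n{RESPONSE_KEY}\n"

prompt = build_prompt("Explain to me the difference between nuclear fission and fusion.")
print(prompt)
```

Because the model was fine-tuned on records in this shape, wrapping raw instructions this way (rather than passing bare text to the base model) is what makes it follow instructions reliably.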