Use LLMs To Extract Data From Text (Expert Mode)


Duration	:	15:28
Views	:	62K

Great video, Greg! I was thinking, would using a tool like HasData complement this method by scraping various text datasets for training models?
Comment from : @kareemgates1895

m'ke?
Comment from : @EranMoshe-y9h

I'm having some problems running this with Ollama local models (I tried Llama 3.1 and NuExtract) and it's not working. The output has a lot of repetitive info.
Comment from : @JustDoIt-pl2sl

Does anyone know how to do this with an LLM model loaded from transformers?
Comment from : @constandinosk3251

Hey Greg, thanks for this video! Since there is a limit on using the OpenAI API without paying, how can the above implementation be carried out with other open-source LLMs?
Comment from : @SudhakarVyas

Fantastic tutorial! It would be great to see another tutorial using "transformers" instead of OpenAI, with Chroma or any local database, and how you would save the extracted information. Does Kor tokenize that information, etc.?
Comment from : @fabsync

This was awesome content.
Comment from : @Ideariver

Great introduction. Perfect pacing. I’m going to do some further research to see if I can figure out a way to use Kor with a local language model, since I deal with confidential patient data in a healthcare setting.
Comment from : @steveadams617

Can anyone help me with this error, [initial_value must be str or None, not dict], while executing chain.predict_and_parse?
Comment from : @mvasanth5200

Thanks, this was super useful! I would love to get some insight into the feedback you got from those 80 companies
Comment from : @jakobkristensen2390

Thanks Greg, this was really helpful!
Comment from : @brightstartdaily

Can we extract important content from a research paper, like some text from the abstract and some from the results or an ablation table? Can you make a video about how to customize that text extraction to Google Sheets?
Comment from : @pooja1124

Wow!! it's magic
Comment from : @manujkumarjoshi9342

Is there an existing tool that is cutting low-signal text?
Comment from : @wiktorm9858

Hi Greg, thank you for the great video! How would you go about extracting "tags" or predefined values and not string texts? Especially if the values number in the thousands and are too many to just feed into the prompt (token optimization etc.). Any ideas? Thank you!
Comment from : @pocker91

Hi, may I know if it works with LinkedIn?
Comment from : @Teathebest0

very insightful - thank you
Comment from : @mahroushkagaurav3601

Can I use it to extract events from text using Hugging Face or any other open-source LLM?
Comment from : @muhammadowaissiddiqui2443

Is there a way to read an entire PDF with LangChain and Kor?
Comment from : @TonyHoangPodcast

Incredible. The million-dollar question 😊: How do you "teach" ChatGPT just once what the schema is and be able to validate unlimited texts without spending tokens inputting the schema in the prompt and without fine-tuning the model?
Comment from : @eduardomoscatelli

damn, thats cool
Comment from : @dprggrmr

Finally a video that I can enjoy without that background noise. Thanks a lot, and please continue without background music.
Comment from : @abdoualgerian5396

It's just too expensive to offer a viable product with OpenAI. Ada-002 is $0.0004 per 1K tokens.
Comment from : @greendsnow

you painted!
Comment from : @rolenle8794

How can I extract the data from an API output as JSON?
Comment from : @AditiTambi-y8g

This looks like a really interesting approach. @DataIndependent, any ideas on the best approach for using tabular data (whether from a pandas dataframe, PySpark dataframe or SQL data table) in conjunction with LLMs? What about combining tabular data with text documents?
Comment from : @davidmichaelcomfort

Hey Greg, you sure this doesn't work well with GPT-3.5?
Comment from : @mysticaltech

Hello, how do I connect LangChain not to ChatGPT but to local chatbots by their localhost names?
Comment from : @vanamonde_8809

awesome
Comment from : @Ryan-yj4sd

Newbie here, I don’t understand why you would need to use the library for this task. Couldn’t you just specify in your LLM prompt the exact output and formatting you need? Cheers!😊
Comment from : @yellowboat8773

Your channel is gold! Thanks a lot for all those tutorials.
Comment from : @DelaLange

Greg, great video as always! I achieved the same results by including the desired output in JSON format along with the initial prompt itself, without using the Kor library
Comment from : @pradeepthiyyagura8677
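
The approach described in the comment above (putting the desired JSON format in the prompt and skipping Kor) might look roughly like the sketch below. It is an illustration under stated assumptions, not the commenter's or the video's actual code: the schema fields, the function names `build_prompt` and `parse_reply`, and the stand-in reply are all hypothetical, and no model API is called.

```python
import json

# Hypothetical schema for illustration; these field names are not from the video.
SCHEMA = {"name": "string", "age": "integer"}


def build_prompt(text: str) -> str:
    """Embed the desired JSON shape directly in the prompt."""
    return (
        "Extract the fields below from the text and reply with JSON only.\n"
        f"Desired output format: {json.dumps(SCHEMA)}\n"
        f"Text: {text}"
    )


def parse_reply(reply: str) -> dict:
    """Parse a model reply and check that every expected key is present."""
    data = json.loads(reply)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = [key for key in SCHEMA if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Stand-in for a real model response; send build_prompt(...) to any LLM instead.
fake_reply = '{"name": "Alex", "age": 30}'
print(parse_reply(fake_reply))
```

Because a model's reply is not guaranteed to match the requested format, the parsing step raises an error rather than passing a malformed object downstream.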

This is the wrong approach, imho. You have to use the output as text and not as an object. If you do that, you lose the ability to stream the output, which is a main feature of these LLMs. If you want to structure your text, you'll have to go with MD (Markdown). Not to mention that the translation into an object is never deterministic due to the nature of LLMs, and you could get something unusable for your front end.
Comment from : @Grahfx

Thanks Greg, this is very relevant, will give Kor a try!
Comment from : @tomwalczak4992

Come on, why did you steal my idea 😅 I was literally thinking about how to scrape a YouTube channel's data using LLMs. I was looking for the info. You came right on time!
Comment from : @ChatGPT-ef6sr

I really liked your video "The Data Learning Journey (Part 1)", and am hoping you will post Part 3 soon
Comment from : @AB51002

Is that few-shot NER? 🤔
Comment from : @catyung1094

I think we'll have more such prompt-based tooling available sooner or later. Any other specific tools you are experimenting with?
Comment from : @rajpdus

If you were to extract relationship info (which I gather you could, especially in your first example), I could see creating a link diagram from this info (e.g. with Neo4j).
Comment from : @gridplan

pip install kor? His document doesn't specify.
Comment from : @thorthumb0031

Interesting, I'm going to give this a go. I've experimented with pydantic for parsing LLM output into JSON, so this is super relevant right now. Thanks Greg, great explainer as always.
Comment from : @ac_cobra8540