
Did you know companies can manage their legal documents using NLP and AI?
Imagine you have all kinds of PDF documents, some of which were scanned. Before anything else, you need to make sense of the scans themselves, which takes time.
A technology called Optical Character Recognition (OCR) is often used to bring that data in and turn the scanned pages into machine-readable text, basically documents a computer can understand.
You now have text, which is a big, important piece, but you still need to understand the content. Your NLP and AI program is going to have to do some kind of whizbang magic to understand what's going on in all of that content.
Most good programs will generate internal metadata about the document or the content itself. It might even go down to noting that the bolded words are there for emphasis and the italicized words are there for another reason.
That can also lead into an adjacent task, summarizing: you might want to summarize the document so that you don't have to deal with a hundred-page document. The idea is to get the general concept of the whole document.
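To make the summarizing idea concrete, here is a minimal sketch of one simple approach, frequency-based extractive summarization. It is not the method any particular product uses. It assumes the text has already been pulled out of the document (for example by OCR), and the names `STOP_WORDS`, `split_sentences`, and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {
    "the", "a", "an", "of", "to", "and", "in", "is", "that", "this",
    "it", "for", "on", "by", "be", "or", "as", "with",
}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase words, with stop words dropped.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Average word frequency, so long sentences are not favored by length.
        return sum(freq[t] for t in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original document order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    sample = (
        "The tenant pays rent monthly. The weather was pleasant. "
        "Late rent incurs a fee. Rent is due on the first."
    )
    print(summarize(sample, max_sentences=2))
```

Real legal-document tools would likely rely on trained language models rather than word counts, but the sketch shows the basic goal: reduce a long document to the few sentences that carry its general concept.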