Extract text from Microsoft Office PowerPoint presentations
To extract text from Microsoft Office PowerPoint presentations, use the GetText and GetText(pageIndex) methods. GetText extracts text from the entire presentation; GetText(pageIndex) extracts text from the selected slide.
Here are the steps to extract text from a Microsoft Office PowerPoint presentation:
Instantiate a Parser object for the initial presentation;
Call the GetText method to obtain a TextReader object;
Read the text from the reader.
The GetText method returns null if text extraction isn't supported for the document. For example, text extraction isn't supported for Zip archives, so GetText returns null for them. For an empty Microsoft Office PowerPoint presentation, GetText returns an empty TextReader object (reader.ReadToEnd returns an empty string).
The following example demonstrates how to extract text from a Microsoft Office PowerPoint presentation:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract a text into the reader
    using (TextReader reader = parser.GetText())
    {
        // Print a text from the presentation
        Console.WriteLine(reader.ReadToEnd());
    }
}
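The example above calls reader.ReadToEnd() directly, so it would throw a NullReferenceException for a document where GetText returns null. A minimal sketch of a guarded call follows. It assumes the GroupDocs.Parser package is referenced; the ExtractTextOrNull helper, the Program class, and the command-line argument are illustrative names, not part of the library.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

class Program
{
    // Hypothetical helper (not part of GroupDocs.Parser): returns the extracted
    // text, or null when text extraction isn't supported for the document.
    static string ExtractTextOrNull(string filePath)
    {
        using (Parser parser = new Parser(filePath))
        // A using statement accepts a null resource and simply skips Dispose.
        using (TextReader reader = parser.GetText())
        {
            // The null-conditional call yields null when reader is null.
            return reader?.ReadToEnd();
        }
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path to a document.");
            return;
        }

        string text = ExtractTextOrNull(args[0]);

        if (text == null)
        {
            Console.WriteLine("Text extraction isn't supported for this document.");
        }
        else if (text.Length == 0)
        {
            Console.WriteLine("The document contains no text.");
        }
        else
        {
            Console.WriteLine(text);
        }
    }
}
```

Separating the null case from the empty-string case mirrors the two outcomes described above: an unsupported format versus a supported but empty presentation.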
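Here are the steps to extract text from a slide of a Microsoft Office PowerPoint presentation:
Instantiate a Parser object for the initial presentation;
Call the GetDocumentInfo method to obtain an IDocumentInfo object and read the slide count from its PageCount property;
Call the GetText(pageIndex) method with the slide index to obtain a TextReader object;
Read the text from the reader.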
The following example demonstrates how to extract text from each slide of a Microsoft Office PowerPoint presentation:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Get the document info
    IDocumentInfo documentInfo = parser.GetDocumentInfo();

    // Iterate over slides
    for (int p = 0; p < documentInfo.PageCount; p++)
    {
        // Print a slide number
        Console.WriteLine(string.Format("Slide {0}/{1}", p + 1, documentInfo.PageCount));

        // Extract a text into the reader
        using (TextReader reader = parser.GetText(p))
        {
            // Print a text from the presentation slide
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
The following example demonstrates how to extract raw text from each slide of a Microsoft Office PowerPoint presentation:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Get the document info
    IDocumentInfo documentInfo = parser.GetDocumentInfo();

    // Check if the document has slides
    if (documentInfo == null || documentInfo.RawPageCount == 0)
    {
        Console.WriteLine("The document has no slides.");
        return;
    }

    // Iterate over slides
    for (int p = 0; p < documentInfo.RawPageCount; p++)
    {
        // Print a slide number
        Console.WriteLine(string.Format("Slide {0}/{1}", p + 1, documentInfo.RawPageCount));

        // Extract a raw text into the reader (TextOptions(true) enables raw mode)
        using (TextReader reader = parser.GetText(p, new TextOptions(true)))
        {
            // Print a text from the presentation slide
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
GroupDocs.Parser can also extract text from Microsoft Office PowerPoint presentations as HTML, Markdown, and formatted plain text. For more details, see Extract Formatted Text.
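Here are the steps to extract text from a Microsoft Office PowerPoint presentation as HTML:
Instantiate a Parser object for the initial presentation;
Instantiate a FormattedTextOptions object with the FormattedTextMode.Html parameter;
Call the GetFormattedText method to obtain a TextReader object;
Read the text from the reader.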
The following example shows how to extract text from a Microsoft Office PowerPoint presentation as HTML:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract a formatted text into the reader
    using (TextReader reader = parser.GetFormattedText(new FormattedTextOptions(FormattedTextMode.Html)))
    {
        // Print a formatted text from the presentation
        Console.WriteLine(reader.ReadToEnd());
    }
}
More resources
GitHub examples
You can run the code above and see the feature in action in our GitHub examples.
Along with the full-featured .NET library, we provide simple but powerful free apps.
You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, and more with our Free Online Document Parser App.