Read text from PDF using iTextSharp

I started my day looking for a way to read text from PDF files in C#, and I eventually found a working solution on Stack Overflow.
I am posting it here to help readers of this blog who are searching for the same thing.

Credit to Stack Overflow:

Link: https://stackoverflow.com

Here is the code I found for reading text from a PDF using iTextSharp.

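public string ReadPdfFile(object Filename)
{
    // StringBuilder avoids rebuilding the whole string on every page.
    StringBuilder text = new StringBuilder();
    try
    {
        // "using" closes the reader even if extraction throws partway through.
        using (PdfReader reader = new PdfReader((string)Filename))
        {
            // iTextSharp page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // Use a fresh strategy for each page so text does not accumulate across pages.
                ITextExtractionStrategy its = new SimpleTextExtractionStrategy();
                string s = PdfTextExtractor.GetTextFromPage(reader, page, its);

                // GetTextFromPage returns a regular .NET string, so no encoding
                // conversion is needed before appending it.
                text.Append(s);
            }
        }
    }
    catch (Exception ex)
    {
        // Show the error, then return whatever text was read before the failure.
        MessageBox.Show(ex.Message);
    }
    return text.ToString();
}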

If you are using Visual Studio, you can add a reference to iTextSharp through NuGet. In Solution Explorer, right-click your project name and choose Manage NuGet Packages. Type iTextSharp in the search box and press Enter. When the results appear, select the iTextSharp package (normally the first result) and click the Install button on the right side of the result.

[Image: iTextSharp as shown in Manage NuGet Packages]

After adding the reference, bring the library into scope by adding these using directives at the top of your code file:

  • using iTextSharp.text.pdf;
  • using iTextSharp.text.pdf.parser;

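The ReadPdfFile method above also needs using System; (for Exception), using System.Text; and using System.Windows.Forms; (for MessageBox). With these in place, you can use iTextSharp from your code.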
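
To show the using directives in context, here is a minimal sketch of the same page-by-page extraction without the MessageBox dependency, so it also works outside a Windows Forms project. It assumes iTextSharp 5.5.x, where PdfReader implements IDisposable. The class name PdfTextReader and the method name ReadAllText are hypothetical names chosen for this example; they are not part of iTextSharp or of the original answer.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class for illustration; not part of iTextSharp.
public static class PdfTextReader
{
    // Returns the text of every page, in page order, one page per line block.
    // Exceptions (for example, a missing file) are left for the caller to handle.
    public static string ReadAllText(string path)
    {
        var text = new StringBuilder();

        // Assumes iTextSharp 5.5.x, where PdfReader is disposable;
        // "using" releases the file even if extraction throws.
        using (var reader = new PdfReader(path))
        {
            // Page numbers are 1-based.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // A new strategy per page keeps earlier pages out of the result.
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page, strategy));
            }
        }

        return text.ToString();
    }
}
```

You would call it with the path to your own file, for example string text = PdfTextReader.ReadAllText(pathToPdf); where pathToPdf is a placeholder, and wrap the call in try/catch if you want to report errors to the user the way ReadPdfFile does.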

Thank you for reading!

I hope this simple solution helps you.

Happy coding!!