Reading fields from PDF - Level Extreme

Reading fields from PDF

Message

11/11/2011 11:05:05

Charles Hankey
Consultant
Cleveland, Ohio, United States

11/11/2011 10:44:59

Michel Fournier
Level Extreme Inc.
Petit-Rocher, New Brunswick, Canada

General information

Forum:

ASP.NET

Category:

Other

Title:

Re: Reading fields from PDF

Environment versions

Environment:

VB 9.0

OS:

Windows 7

Network:

Windows 2003 Server

Database:

MS SQL Server

Application:

Web

Miscellaneous

Thread ID:

01528652

Message ID:

01528656

I use iTextSharp, a free C# port of the Java iText library, to discover field names on PDF forms and to fill the forms programmatically from data. It works beautifully, and I have a lot of VB code you are welcome to that might save you some time.

Discovering the field names for a form is a single line of code. I haven't done it, but I believe there is something equally easy for reading field contents.

But this is where to start:

http://www.codeproject.com/KB/graphics/iTextSharpTutorial.aspx

If you search UT for iTextSharp, you may find some links I posted previously to good samples and so on. I do not have them at hand right now, but you won't have any trouble finding docs.

I have an app that fills in hundreds of PDF forms from data. The nice part is that you can throw all the data at every form and only the fields that match get filled, so you can have one data engine pointed at all your forms.
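To make the "one data engine" idea concrete, here is a minimal sketch of the matching step only, written in plain Python rather than against iTextSharp. It assumes each form has already been reduced to a list of its field names; the function name `fill_matching_fields`, the sample record, and the form names are all hypothetical and are not taken from my app.

```python
from typing import Dict, Iterable, List, Tuple


def fill_matching_fields(
    form_fields: Iterable[str], data: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (filled, unfilled).

    filled   -- values for the form fields that have a matching key in data
    unfilled -- names of the form fields with no matching key in data
    Data keys that the form does not contain are simply ignored.
    """
    filled: Dict[str, str] = {}
    unfilled: List[str] = []
    for name in form_fields:
        if name in data:  # exact, case-sensitive match on the field name
            filled[name] = data[name]
        else:
            unfilled.append(name)
    return filled, unfilled


# One shared record that is offered to every form (hypothetical sample data).
record = {
    "FirstName": "Ann",
    "LastName": "Lee",
    "Phone": "555-0100",
    "TaxId": "12-345",
}

# Hypothetical forms, each reduced to its list of field names.
forms = {
    "contact.pdf": ["FirstName", "LastName", "Phone"],
    "tax.pdf": ["LastName", "TaxId", "Signature"],
}

for form_name, field_names in forms.items():
    filled, unfilled = fill_matching_fields(field_names, record)
    print(form_name, filled, "unfilled:", unfilled)
```

With a real PDF library, the field names would come from the form itself and each matched value would be written back through the library's own field-setting call. The sketch matches names exactly, so the data keys need to use the same spelling and case as the form fields.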

>Right now, we are using a very primitive PDFToTXT product from http://www.verypdf.com, which has been abandoned and taken over by another party that shows no intention of offering technical support. As a matter of fact, the new site shows "Live Chat Offline". Yes, you read that right: "Live Chat Offline".
>
>This utility converts to TXT, and then we apply all kinds of logic to parse whatever we need. Of course, we sometimes have problems with French characters, as it converts only about 75% of them correctly. It seems their 2007 version has not been updated for newer PDF formats. Also, we end up with page markers and weird characters in the output, which makes it very difficult to obtain a clean parsing of the data we want.
>
>So, is there a .NET PDF reader DLL that would give us a more advanced way of extracting the fields we want? I assume a PDF must have hidden field names, so with a more advanced utility we could just use the PDF field name and obtain the value as is.

Charles Hankey
