Optimizing research on file type

Message

14/01/2016 10:42:57

Gregory Adam
Belgium

14/01/2016 08:58:22

Michel Fournier
Level Extreme Inc.
Petit-Rocher, New Brunswick, Canada

General information

Forum: ASP.NET
Category: Other
Title: Re: Optimizing research on file type

Environment versions

Environment: VB 9.0
OS: Windows 8.1
Network: Windows 2008 Server
Database: MS SQL Server
Application: Web

Miscellaneous

Thread ID: 01629800
Message ID: 01629804

This message has been marked as one that helped answer the initial question of the thread.

>I have a FileType class which I am using to get the file type of a file. The goal consists of loading the file into a string and analyzing the bytes of the string to detect if a known file type is recognized.
>
>In that FileType() method, I am loading the file into memory such as this:
>
>    Using loStreamReader As New StreamReader(lcFile, System.Text.Encoding.Default)
>        cString = loStreamReader.ReadToEnd()
>    End Using
>
>In our daily operations, we process about 4000 zip files, where each of them contains about 30 or more files, where I have to get the file type of each. So, basically, this method is called about 120000 times on a daily basis. I see this is, of course, impacted by the size of the file. So, if the file is bigger, it will take more time to do the above.
>
>Would there be any faster way to achieve this?

I take it that you unzip the zip file and put the files it contains in a folder somewhere.

To determine the type of a file (apart from looking at the extension), I do not think it is wise to put the contents in a string.

I think you have to process the raw contents of the file.

Do not use a StreamReader, which decodes the bytes into characters according to an encoding, but a FileStream instead ( https://msdn.microsoft.com/en-us/library/tyhc0kft(v=vs.110).aspx ).
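
A minimal sketch of the FileStream approach, assuming the signatures to be recognized all sit within the first few bytes of the file. The helper name ReadHeader and the 16-byte length are illustrative only and are not taken from the original FileType class:

```vbnet
Imports System
Imports System.IO

Module HeaderSketch

    ' Hypothetical helper: returns up to tnCount raw bytes from the start of the file.
    Function ReadHeader(ByVal tcFile As String, ByVal tnCount As Integer) As Byte()
        Dim laBuffer(tnCount - 1) As Byte
        Dim lnTotal As Integer = 0

        Using loStream As New FileStream(tcFile, FileMode.Open, FileAccess.Read, FileShare.Read)
            ' Read() may return fewer bytes than requested, so loop until
            ' the buffer is full or the end of the file is reached.
            While lnTotal < tnCount
                Dim lnRead As Integer = loStream.Read(laBuffer, lnTotal, tnCount - lnTotal)
                If lnRead = 0 Then Exit While
                lnTotal += lnRead
            End While
        End Using

        ' Trim the buffer when the file is shorter than tnCount.
        If lnTotal < tnCount Then
            ReDim Preserve laBuffer(lnTotal - 1)
        End If
        Return laBuffer
    End Function

    ' Prints the first 16 bytes of the file given on the command line, in hex.
    Sub Main(ByVal taArgs As String())
        If taArgs.Length > 0 Then
            Console.WriteLine(BitConverter.ToString(ReadHeader(taArgs(0), 16)))
        End If
    End Sub

End Module
```

With this, the cost of each call should depend on the number of bytes requested rather than on the size of the file.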

You can either:
- read all the bytes into a byte array
- or use Seek() and read only the parts you are interested in with Read() or ReadByte()
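
The two options could be sketched as follows. The signatures are only examples (a PNG file starts with the bytes 89 50 4E 47 0D 0A 1A 0A, and a POSIX ustar-format tar archive carries the ASCII text "ustar" at offset 257), and the helper names are hypothetical. The real list of file types would come from your FileType class:

```vbnet
Imports System
Imports System.IO

Module SignatureSketch

    ' Compares the beginning of taData with taSignature, byte by byte.
    Function StartsWithBytes(ByVal taData As Byte(), ByVal taSignature As Byte()) As Boolean
        If taData.Length < taSignature.Length Then Return False
        For lnI As Integer = 0 To taSignature.Length - 1
            If taData(lnI) <> taSignature(lnI) Then Return False
        Next
        Return True
    End Function

    ' Option 1: load the whole file as raw bytes (no decoding) and test a prefix.
    Function IsPng(ByVal tcFile As String) As Boolean
        Dim laPng As Byte() = New Byte() {&H89, &H50, &H4E, &H47, &HD, &HA, &H1A, &HA}
        Return StartsWithBytes(File.ReadAllBytes(tcFile), laPng)
    End Function

    ' Option 2: Seek() to a known offset and Read() only the bytes needed there.
    Function IsUstarTar(ByVal tcFile As String) As Boolean
        Dim laMagic As Byte() = New Byte() {&H75, &H73, &H74, &H61, &H72} ' "ustar"
        Dim laBuffer(laMagic.Length - 1) As Byte

        Using loStream As New FileStream(tcFile, FileMode.Open, FileAccess.Read)
            ' A file too short to hold the signature cannot match.
            If loStream.Length < 257 + laMagic.Length Then Return False
            loStream.Seek(257, SeekOrigin.Begin)

            Dim lnTotal As Integer = 0
            While lnTotal < laBuffer.Length
                Dim lnRead As Integer = loStream.Read(laBuffer, lnTotal, laBuffer.Length - lnTotal)
                If lnRead = 0 Then Exit While
                lnTotal += lnRead
            End While
        End Using

        Return StartsWithBytes(laBuffer, laMagic)
    End Function

    Sub Main(ByVal taArgs As String())
        If taArgs.Length > 0 Then
            Console.WriteLine("PNG: " & IsPng(taArgs(0)).ToString())
            Console.WriteLine("tar: " & IsUstarTar(taArgs(0)).ToString())
        End If
    End Sub

End Module
```

When every signature you test is at a fixed position, the second option is probably the better fit for 120000 calls a day, since it never touches the rest of a large file.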

Gregory
