>I have a FileType class which I am using to get the file type of a file. The goal consists of loading the file into a string and analyzing the bytes of the string to detect if a known file type is recognized.
>
>In that FileType() method, I am loading the file into memory such as this:
>
>
> Using loStreamReader As New StreamReader(lcFile, System.Text.Encoding.Default)
> cString = loStreamReader.ReadToEnd()
> End Using
>
>
>In our daily operations, we process about 4000 zip files, where each of them contains about 30 or more files, where I have to get the file type of each. So, basically, this method is called about 120000 times on a daily basis. I see this is, of course, impacted by the size of the file. So, if the file is bigger, it will take more time to do the above.
>
>Would there be any faster way to achieve this?
I take it that you unzip the zip file and put the files it contains in a folder somewhere.
To determine the type of a file (apart from looking at its extension), I do not think it is wise to load the contents into a string.
I think you have to process the raw contents of the file.
Do not use a StreamReader, which decodes the bytes into chars according to an encoding.
Use a FileStream instead ( https://msdn.microsoft.com/en-us/library/tyhc0kft(v=vs.110).aspx ).
You can either
- read all the bytes into a byte array
- or use Seek() and read only the parts you are interested in with Read() or ReadByte()
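
As a rough sketch of the second option, assuming the file types you care about can be recognized from a signature in the first few bytes, something like the following reads only a small header instead of the whole file. The names ReadHeader, StartsWith and DetectType are made up for illustration, and the four signatures are just common examples (PDF, ZIP, PNG, JPEG), not the list your FileType class actually uses.

```vbnet
Imports System
Imports System.IO

Module FileTypeSketch

    ' Hypothetical helper: reads at most maxBytes raw bytes from the start of the file.
    Function ReadHeader(path As String, maxBytes As Integer) As Byte()
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read)
            Dim toRead As Integer = CInt(Math.Min(CLng(maxBytes), fs.Length))
            Dim buffer(toRead - 1) As Byte
            Dim total As Integer = 0
            ' Read() may return fewer bytes than requested, so loop until done.
            While total < toRead
                Dim n As Integer = fs.Read(buffer, total, toRead - total)
                If n = 0 Then Exit While
                total += n
            End While
            If total < toRead Then Array.Resize(buffer, total)
            Return buffer
        End Using
    End Function

    ' Hypothetical helper: True if data begins with the given signature bytes.
    Function StartsWith(data As Byte(), signature As Byte()) As Boolean
        If data.Length < signature.Length Then Return False
        For i As Integer = 0 To signature.Length - 1
            If data(i) <> signature(i) Then Return False
        Next
        Return True
    End Function

    ' Illustrative only: extend with the signatures you actually need.
    Function DetectType(path As String) As String
        Dim header As Byte() = ReadHeader(path, 8)
        If StartsWith(header, New Byte() {&H25, &H50, &H44, &H46}) Then Return "pdf"  ' "%PDF"
        If StartsWith(header, New Byte() {&H50, &H4B, &H3, &H4}) Then Return "zip"    ' "PK" 03 04
        If StartsWith(header, New Byte() {&H89, &H50, &H4E, &H47}) Then Return "png"  ' 89 "PNG"
        If StartsWith(header, New Byte() {&HFF, &HD8, &HFF}) Then Return "jpeg"
        Return "unknown"
    End Function

End Module
```

With this approach the cost per call no longer grows with the file size, since only 8 bytes are read no matter how big the file is. If one of your formats keeps its signature somewhere other than the start, you could call fs.Seek(offset, SeekOrigin.Begin) before reading. If you really do need the whole content, File.ReadAllBytes(path) gives you the first option in a single call.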
Gregory