Thread: File Format reader

  1. #1

    Thread Starter
    Hyperactive Member
    Join Date
    Jul 2002
    Posts
    481

    File Format reader

    Hello Group,

    I am thinking about creating a converter that can take other software's file formats and convert them so my software can use the data.

    It appears most other files are in binary format. Does anyone have a suggestion for the fastest way to see how data is stored in binary files (in what order, and with which data types)? I have done this before, but I did it manually: on each iteration I tried a data type, read the next variable in the file, and determined its use/type.

    Any suggestions are welcome.

  2. #2
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,721

    Re: File Format reader

    There are thousands of formats.
    What kind of file format? Graphics, databases, scripts, drivers, documents, modules, sounds, etc.?

  3. #3
    Fanatic Member
    Join Date
    Feb 2017
    Posts
    858

    Re: File Format reader

    Years ago Ziff Davis Publishing put out a free copy of FileSnoop that allowed one to view any file as either text or binary. If I recall correctly, I found it in PC Magazine.
    I never found a file format it could NOT view. My guess is that a hex editor would give the same binary output. So personally I don't see where the file format is an issue, unless one wants, for example, to show or convert only the text of something such as an MS Word doc file.
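
    For readers without a hex editor at hand, the same kind of view can be sketched in a few lines of Python. This is only an illustration of what a hex view shows; the function name hex_dump and the 16-byte row width are choices made for this example, not anything taken from FileSnoop.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex columns, and printable ASCII, like a hex editor."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Pad the hex column so the text column lines up on a short last row.
        hex_part = hex_part.ljust(width * 3 - 1)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample bytes; in practice read them with open(path, "rb").
    print(hex_dump(b"RBT1\x01\x00\x00\x00hello world, this is data"))
```

    Reading the file in "rb" mode and dumping the first few hundred bytes is usually enough to spot text labels, padding, and repeating record sizes.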

  4. #4

    Thread Starter
    Hyperactive Member
    Join Date
    Jul 2002
    Posts
    481

    Re: File Format reader

    Quote Originally Posted by baka View Post
    There are thousands of formats.
    What kind of file format? Graphics, databases, scripts, drivers, documents, modules, sounds, etc.?
    It's a flat binary file for a robotics control software package.

  5. #5
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,721

    Re: File Format reader

    If you already know the file format (this "robotics control software"), what you need to know is the header/protocol of this format.
    All formats follow some kind of structure: some start with a specific header, and some have none and go directly into storing data.

    When you know that, it is easy enough to create a parser that converts the raw data into a structure.
    Some software also compresses and/or encrypts the data. If so, you need to know what it uses and create a decompressor/decrypter before you can parse.

    An example is the *.swf format (Flash), which has a specific uncompressed header that tells you whether the Flash file is compressed or not (there are 3 types). The header also gives you the file size and some other information, such as the version.
    So here you need to read the header, then decompress if needed, and then follow the protocol for reading the data.
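
    As a rough sketch of those steps in Python (a language chosen for this example, not by the poster): an SWF file starts with a 3-byte signature (FWS for uncompressed, CWS for zlib, ZWS for LZMA), one version byte, and a 4-byte little-endian file length. The function name read_swf_header and the dictionary keys are hypothetical, and the LZMA case is deliberately left unhandled.

```python
import struct
import zlib


def read_swf_header(data: bytes) -> dict:
    """Read the 8-byte SWF header and return the (decompressed) remainder."""
    if len(data) < 8:
        raise ValueError("too short to be an SWF file")
    signature = data[:3]
    version = data[3]
    # Total file length after decompression, stored little-endian.
    (file_length,) = struct.unpack_from("<I", data, 4)
    if signature == b"FWS":
        body = data[8:]                   # stored uncompressed
    elif signature == b"CWS":
        body = zlib.decompress(data[8:])  # everything after the header is zlib
    elif signature == b"ZWS":
        raise NotImplementedError("LZMA-compressed SWF is not handled in this sketch")
    else:
        raise ValueError("not an SWF signature: %r" % signature)
    return {
        "signature": signature.decode("ascii"),
        "version": version,
        "file_length": file_length,
        "body": body,
    }
```

    The same pattern (read a small fixed header, decide on decompression, then parse the body) applies to many other formats, although the field layout will of course differ.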

  6. #6

    Thread Starter
    Hyperactive Member
    Join Date
    Jul 2002
    Posts
    481

    Re: File Format reader

    I do not know the file format. That is why I am looking for a utility that can automatically find at least the order of the data types stored in the file.

    If anyone has a copy of FileSnoop, please share; it's nowhere to be found.

    Thanks

  7. #7
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,872

    Re: File Format reader

    There are no generic tools for finding a file structure.
    It can literally be anything.
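
    Short of a generic tool, the manual trial-and-error approach can at least be sped up by decoding the same offset as several candidate types at once and judging which value looks plausible. The sketch below is Python and assumes little-endian data, which is common on PC software but is only an assumption here; the names CANDIDATES and interpret_at are made up for the example.

```python
import struct

# (label, struct format) pairs; "<" means little-endian, which is an assumption.
CANDIDATES = [
    ("uint8", "<B"),
    ("int16", "<h"),
    ("uint16", "<H"),
    ("int32", "<i"),
    ("uint32", "<I"),
    ("float32", "<f"),
    ("float64", "<d"),
]


def interpret_at(data: bytes, offset: int) -> dict:
    """Decode the bytes at `offset` as every candidate type that still fits."""
    results = {}
    for name, fmt in CANDIDATES:
        size = struct.calcsize(fmt)
        if offset + size <= len(data):
            (value,) = struct.unpack_from(fmt, data, offset)
            results[name] = value
    return results


if __name__ == "__main__":
    # Hypothetical record: a 32-bit integer followed by a 32-bit float.
    sample = struct.pack("<if", 1000, 1.5)
    for name, value in interpret_at(sample, 0).items():
        print(f"{name:8} {value}")
```

    A value such as 1000 or 1.5 stands out against the huge or denormal numbers that the wrong interpretations tend to produce, but the final call still needs a human who knows what the data should contain.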

  8. #8
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    9,853

    Re: File Format reader

    I'll jump in here just because I'm bored.

    Quote Originally Posted by axisdj View Post
    It appears most other files are in binary format.
    Axisdj,

    The use of the word "most" in that sentence bugs me each time I read it. In the most basic sense, ALL files are in a binary format. Again, speaking at the basic level, ALL files are composed of ONEs and ZEROs, which is what binary is.

    Some may want to argue that ASCII isn't binary, but that is just not correct. ASCII is binary just like any other "format" is binary; it's still ONEs and ZEROs. It's just that, if it's pure ASCII, the high (eighth) bit of every byte will always be ZERO, and the other seven bits will represent values on the ASCII table.

    So, as we see, ALL files are binary.

    When you say what "format" are they, you're really asking "what program can read a particular binary file". For instance, we have files that Notepad can read. Those would include ASCII, UTF-8, and UTF-16, with the UTF formats either containing the BOM header or not. So, there's a few different "formats". And, just to say it, UTF formats will use that 8th bit.
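
    To make the BOM and eighth-bit points concrete, here is a small Python sketch that guesses which of those text "formats" a file is in. The function name sniff_text_encoding and the labels it returns are invented for this example, and the check is a heuristic: UTF-16 without a BOM and UTF-32 are not detected.

```python
import codecs


def sniff_text_encoding(data: bytes) -> str:
    """Guess a text encoding from a BOM, the high bit, or a trial UTF-8 decode."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # Pure ASCII never sets the high bit of any byte.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

    Anything that comes back as "unknown" is a candidate for the non-text formats discussed next.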

    And, if we just stick with the ASCII or UTF files, we can easily delve into sub-"formats". These might include things like HTML, XML, DAE (collada), INI, BAS, CLS, FRM, C, C++, CS, and literally millions of other "text readable" files that store all kinds of information.

    Now, let's go to non-ASCII non-Unicode files. Again, as was suggested above, there are millions of possibilities, and a hex editor is going to be the only thing that can read them all ... because a hex editor doesn't care what the "format" is. But, just to rattle off a few, there's any ZIP file (which would include the DOCX, XLSX, PPTX, etc files), EXE files, DLL files (or any compiled program), C3D files (which I use for motion capture), any file written with IEEE floating-point or two's complement numbers in it, virtually any picture-type file, or just about anything we want.

    Now many files will have a "header" in them which helps to identify them, but there's no mandate about this. Some files will just be identified by their file extension. And there is absolutely no standard for what these headers are and how they're used. And, once we start thinking about it, there are 1000s (if not millions) of software programs out there, most of which write some kind of file(s). On my taskbar, I've got SPSS (which writes SAV files) ... these are yet another file type.
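
    Where a file does carry an identifying header, checking for it is simple. The Python sketch below tests a handful of well-known signatures; the table is tiny on purpose, the labels are this example's own wording, and a match only suggests a format rather than proving it.

```python
# A few widely documented leading signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, XLSX, PPTX)"),
    (b"%PDF-", "PDF document"),
    (b"MZ", "DOS/Windows executable (EXE, DLL)"),
]


def identify(data: bytes):
    """Return a label for the first matching signature, or None if nothing matches."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None
```

    A file from an unknown vendor will usually return None here, which is exactly the situation described above: no mandate, no standard, and no entry in anyone's table.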

    Without having some idea what software wrote the file (and, in many cases, what version of the software wrote it), it's virtually impossible to know the "format" of the binary file. If you know it came from an identified handful of software programs, you might be able to start to guess, and probably be pretty good at it. But, to say you're going to identify all file formats? That's wishful thinking, and essentially impossible, in addition to it being a constantly moving target.

    Good Luck,
    Elroy

  9. #9
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,116

    Re: File Format reader

    Reminds me of a frustrating experience at a past job. I got a call from my boss who was at a workstation trying to help a user import a file into some medical program. The file wouldn't import and the program was throwing an error of "Invalid dat file format" or something similar. So my boss calls me up and explains the situation and asks what the correct format of a dat file is. I paused and gave the correct answer: "Whatever format the program expects it to be in". And she just didn't get it. This wasn't a program I had ever even heard of or seen, so I had no idea what a "proper" import file was supposed to look like, and neither did she. But in her mind, apparently a "dat" file had some sort of formal or quasi-formal formatting spec, like .zip, or .pdf, and my attempt to explain to her otherwise was falling on deaf ears.

    Memories...

    Good luck axisdj.

  10. #10
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,721

    Re: File Format reader

    What we can do is narrow it down to specific formats.
    I remember writing a program in DOS using Turbo Pascal that would recognize a couple of different picture formats, a bunch of text files/documents, and snd/wav/mod (different music/sound data).
    Sure, it was only 10 or so formats, but they were the most used ones and the stuff that I needed. If there was a new format that I needed, I would add it to the database.

    What you need to do is the same:
    pick the formats you want your tool to recognize and figure out the header/protocol.
    Usually it's quite easy to do, but some formats differ depending on the version and on whether there is compression or not. It's doable if you spend enough time analyzing the format.

    A lot of times you can find information about the header using Google. Research is important.
    If there is none, you will need to do it yourself, by analyzing the header across multiple files to see if there are similarities.
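
    One way to look for such similarities is to line up several sample files from the same program and list the offsets where every file has the same byte. The Python sketch below does that for the first 64 bytes; the function name constant_offsets, the 64-byte limit, and the sample bytes are all made up for illustration.

```python
def constant_offsets(samples: list, limit: int = 64) -> dict:
    """Return {offset: byte value} for offsets where all samples agree."""
    if not samples:
        return {}
    # Only compare as far as the shortest sample (and never past `limit`).
    length = min(limit, min(len(s) for s in samples))
    first = samples[0]
    return {
        i: first[i]
        for i in range(length)
        if all(s[i] == first[i] for s in samples)
    }


if __name__ == "__main__":
    # Hypothetical files that share a 4-byte tag and differ in a counter byte.
    files = [b"RBT1\x01\x00abc", b"RBT1\x02\x00xyz", b"RBT1\x03\x00abz"]
    for offset, value in constant_offsets(files).items():
        print(f"offset {offset}: 0x{value:02x}")
```

    Runs of constant bytes at the start usually mark a signature or version field, while the offsets that change between files are where counts, sizes, and the actual data begin.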

  11. #11
    The Idiot
    Join Date
    Dec 2014
    Posts
    2,721

    Re: File Format reader

    If you are talking about "ripping tools", that's another question:
    a tool that would "read" the data to find pictures, readable strings, music/sounds, compressed blocks, etc.
    This is kind of similar to reading the header, but you are reading the whole data to find whether there are "headers" within the data that you can extract.
    So, it's all about headers, compression, and decryption.
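
    A bare-bones version of that scan, sketched in Python: search the whole blob for known signatures and report where each one starts. The function name find_embedded and the two signatures used in the demo are this example's choices; a real ripper would also have to work out where each embedded item ends.

```python
def find_embedded(data: bytes, signatures: dict) -> list:
    """Return sorted (offset, label) pairs for every signature found in data."""
    hits = []
    for label, magic in signatures.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, label))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)


if __name__ == "__main__":
    known = {"PNG": b"\x89PNG\r\n\x1a\n", "ZIP": b"PK\x03\x04"}
    # Hypothetical container with two embedded items.
    blob = b"junk" + known["PNG"] + b"more" + known["ZIP"] + b"tail"
    print(find_embedded(blob, known))
```

    Short signatures will produce false hits in random data, so each candidate offset still needs to be checked by trying to parse what follows it.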

  12. #12
    Fanatic Member
    Join Date
    Feb 2017
    Posts
    858

    Re: File Format reader

    axisdj:
    Regarding a copy of FileSnoop, you might check the Wayback Machine and see if a copy exists. Copyright restrictions prohibit distribution. Here are the first two paragraphs of the readme file, which may give some insight:

    FileSnoop lets you access all essential information about the files on your system. It offers up to four views of each file: Information, Formatted view, Text view, and Hex view. The Information view gives, at minimum, basic file information such as type, size, and attributes. There are also special information sections that vary depending on the type of file. For example, information on an image file will include the height, width, and other image details. Formatted view is available only for certain files, including RTF, HTML, AVI, and various types of sound, bitmap, and icon files. This view displays images and plays sound files and animations. Text view lets you view the file as text, with the option to strip out nonprinting characters. With this feature you can, for example, scan all the error messages in an EXE file. The Hex view shows the file in hexadecimal.

    FileSnoop runs under Windows 95, 98, Me, NT 4.0, 2000, and XP. It was written using Delphi 6 with the Update 2 patch applied. The source code for FileSnoop is provided with the utility for those interested in seeing how it works. Note that PC Magazine programs are copyrighted and cannot be distributed, whether modified or unmodified. Use is subject to the terms and conditions of the license agreement distributed with the programs.
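
    The "strip out nonprinting characters" idea from the readme can be approximated without FileSnoop. The Python sketch below pulls runs of printable ASCII out of any binary blob; it is not FileSnoop's code, and the function name extract_strings and the default minimum length of 4 are assumptions of this example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least `min_length` bytes long."""
    # \x20-\x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_length).encode("ascii") + rb",}")
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for part of an EXE file.
    blob = b"\x00\x01File not found\x00ab\x00Disk full\xff"
    print(extract_strings(blob))
```

    Field names, units, and error messages recovered this way often give the first real clues about what the surrounding binary values mean.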

  13. #13
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: File Format reader

    Quote Originally Posted by OptionBase1 View Post
    Reminds me of a frustrating experience at a past job. I got a call from my boss who was at a workstation trying to help a user import a file into some medical program. The file wouldn't import and the program was throwing an error of "Invalid dat file format" or something similar. So my boss calls me up and explains the situation and asks what the correct format of a dat file is. I paused and gave the correct answer: "Whatever format the program expects it to be in". And she just didn't get it. This wasn't a program I had ever even heard of or seen, so I had no idea what a "proper" import file was supposed to look like, and neither did she. But in her mind, apparently a "dat" file had some sort of formal or quasi-formal formatting spec, like .zip, or .pdf, and my attempt to explain to her otherwise was falling on deaf ears.

    Memories...

    Good luck axisdj.
    I've gotten quite a few odd requests like that over my life. People always want to read data from all kinds of devices, and it's always a runaround when I ask them for the format of the device's output/input. When I try to obtain this information myself, it's always a task, because more often than not the device was made by some backwater company that may or may not have a barely functional website with inaccurate or incomplete specs. I don't bother trying anymore if it's clear I cannot obtain the format of the device's input/output, or if I would have to be some kind of CIA code cracker to figure out whatever cryptic document they have on the device's formats. I tell you man, it's rough out here.
