Question about the structure of FDA Files (NOT AN ISSUE) #16

dakenblack · 2020-12-11T10:03:43Z

Hi Mark,

Great work on this project. Let me preface this by saying I don't have an issue with the OCT-Converter project; I am currently working on a project to extract some patient data from FDA files, and I'm having a hard time finding the structure of the data.

I am trying to extract some patient data (name, eye side, etc.) from FDA files. The data seems to be in a chunk with the tag PATIENT_INFO_03, but the uocte page (https://bitbucket.org/uocte/uocte/wiki/Topcon%20File%20Format) doesn't have any documentation on this chunk (only PATIENT_INFO_02).

I have some FDA files and some exported data (using Topcon's OCTDataCollector.exe), and a brute-force search of the files for the exported values doesn't yield any matches either. I suspect the data is encrypted, but I can't be sure.
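For readers who want to repeat this kind of brute-force search, here is a minimal sketch. It assumes the value would be stored as plain text in one of a few common encodings, which may not hold for these files; `find_value` and the sample bytes are hypothetical and not part of OCT-Converter.

```python
def find_value(data: bytes, value: str,
               encodings=("utf-8", "utf-16-le", "utf-16-be")):
    """Return {encoding: [offsets]} for each encoding in which `value` occurs in `data`."""
    hits = {}
    for enc in encodings:
        try:
            needle = value.encode(enc)
        except UnicodeEncodeError:
            continue  # value cannot be represented in this encoding
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        if offsets:
            hits[enc] = offsets
    return hits


# Synthetic bytes for illustration; a real run would use Path("scan.fda").read_bytes().
sample = b"\x00\x01" + "399047".encode("utf-16-le") + b"\xff" + b"399047"
print(find_value(sample, "399047"))
```

An empty result only rules out the encodings that were tried; it does not by itself show that the data is encrypted.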

My reason for posting here is that I'm hoping you might have come across this and know something about it.

Jabez

marksgraham · 2020-12-15T11:33:49Z

Hi Jabez,

We've had a bit of discussion about this on issue #13 - have you tried looking in @FDA_FILE_INFO for patient name etc.? I think laterality is encoded in the first byte of @CAPTURE_INFO_02 too.

@antoniohupa did you have any luck finding these fields in the .fda? If you could share some code that would be great - I could incorporate it into the main package.

Mark

antoniohupa · 2020-12-15T13:14:28Z

Hi Mark and Jabez

Yes, I was able to extract that information from FDA files (patient_id, eye, date of capture, etc.). The patient's name is also easy to extract, but I didn't do it because I need to work with anonymized data.
I'm too busy these days and must focus on other projects right now, but I will share my code with you as soon as I can.

A

dakenblack · 2020-12-15T22:36:36Z

Hi Mark and Antonio,
Thanks for getting back to me. I will take a look at those chunks and see if I can find anything useful.

Looking forward to your code snippet as well.

Jabez

dakenblack · 2020-12-16T01:33:54Z

I had a look at the FDA_FILE_INFO chunk and this is what I see :

b'\x02\x00\x00\x00\xe0.\x00\x0010.1.5.48100\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

The number in the middle is the analysis software version; there is no identification here.
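As a rough illustration, the dump above can be split with the standard `struct` module. This sketch assumes two little-endian unsigned 32-bit integers followed by a NUL-padded ASCII string; the names `field_a` and `field_b` are placeholders, since the thread does not say what the two integers mean.

```python
import struct

# Bytes copied from the dump above.
raw = (b'\x02\x00\x00\x00\xe0.\x00\x0010.1.5.48100'
       b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
       b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')

# Assumed layout: two little-endian uint32 values, then a NUL-padded string.
field_a, field_b = struct.unpack_from('<II', raw, 0)
version = raw[8:].rstrip(b'\x00').decode('ascii')

print(field_a, field_b, version)  # 2 12000 10.1.5.48100
```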

I had a look at CAPTURE_INFO_02, I think you're right about the laterality. I need to confirm by looking at some more files though.

antoniohupa · 2020-12-17T11:02:17Z

Hi there

In @PATIENT_INFO_02 you will find the patient ID, name, surname, gender, birth year, month, day, ...
In @CAPTURE_INFO_02 you will find the eye in the first byte (0 = right, 1 = left) and the capture year, month, day, ... I'm not completely sure the code for extracting the eye is correct, but after reviewing the images with the ophthalmologists, it seems to match.

I've added new structures to Mark's FDA class and a couple of functions to extract the data of interest (see below). I'm pretty sure that this code will not work with fda files from Topcon models other than the 3D OCT Maestro, which is the one I'm using. I'll try to get some fda files from other Topcon models to adapt this code to them.

from pathlib import Path

import numpy as np
from construct import Int8un, Int16un, Int32un, PaddedString, Struct

# The two imports below are assumed to match those in OCT-Converter's own fda.py.
from oct_converter.image_types import OCTVolumeWithMetaData
from pylibjpeg import decode


class FDA(object):
    """ Class for extracting data from Topcon's .fda file format.

        Notes:
            Mostly based on description of .fda file format here:
            https://bitbucket.org/uocte/uocte/wiki/Topcon%20File%20Format

        Attributes:
            filepath (str): Path to .fda file for reading.
            header (obj:Struct): Defines structure of volume's header.
            oct_header (obj:Struct): Defines structure of OCT header.
            fundus_header (obj:Struct): Defines structure of fundus header.
            patient_info (obj:Struct): Defines structure of the @PATIENT_INFO_02 chunk.
            capture_date (obj:Struct): Defines structure of the @CAPTURE_INFO_02 chunk.
            chunk_dict (dict): Name of data chunks present in the file, and their start locations.
    """

    def __init__(self, filepath):
        self.filepath = Path(filepath)
        if not self.filepath.exists():
            raise FileNotFoundError(self.filepath)
        self.header = Struct(
            'FOCT' / PaddedString(4, 'ascii'),
            'FDA' / PaddedString(3, 'ascii'),
            'version_info_1' / Int32un,
            'version_info_2' / Int32un
        )
        self.oct_header = Struct(
            'type' / PaddedString(1, 'ascii'),
            'unknown1' / Int32un,
            'unknown2' / Int32un,
            'width' / Int32un,
            'height' / Int32un,
            'number_slices' / Int32un,
            'unknown3' / Int32un,
        )

        self.oct_header_2 = Struct(
            'unknown' / PaddedString(1, 'ascii'),
            'width' / Int32un,
            'height' / Int32un,
            'bits_per_pixel' / Int32un,
            'number_slices' / Int32un,
            'unknown' / PaddedString(1, 'ascii'),
            'size' / Int32un,
        )

        self.fundus_header = Struct(
            'width' / Int32un,
            'height' / Int32un,
            'bits_per_pixel' / Int32un,
            'number_slices' / Int32un,
            'unknown' / PaddedString(4, 'ascii'),
            'size' / Int32un,
            # 'img' / Int8un,
        )

        # Layout of @PATIENT_INFO_02 (615 bytes in total).
        self.patient_info = Struct(
            'Patient id' / PaddedString(32, 'u8'),
            'Patient given name' / PaddedString(32, 'utf8'),
            'Patient surname' / PaddedString(32, 'utf8'),
            'Zeros' / PaddedString(8, 'u8'),
            'Gender' / Int8un,
            'Birth year' / Int16un,
            'Birth month' / Int16un,
            'Birth day' / Int16un,
            # Purpose unknown; a distinct name keeps it from overwriting 'Birth year'.
            'Unknown' / Int16un,
            'Zeros2' / PaddedString(502, 'ascii')
        )

        # Layout of the first 118 bytes of @CAPTURE_INFO_02.
        self.capture_date = Struct(
            'Eye' / Int8un,
            'y' / Int16un,
            'Zeros' / PaddedString(103, 'ascii'),
            'Year' / Int16un,
            'Month' / Int16un,
            'Day' / Int16un,
            'Hour' / Int16un,
            'Minute' / Int16un,
            'Second' / Int16un,
        )

        self.chunk_dict = self.get_list_of_file_chunks()

    def get_list_of_file_chunks(self):
        """Find all data chunks present in the file.

        Returns:
            dict
        """
        chunk_dict = {}
        with open(self.filepath, 'rb') as f:
            # skip header
            raw = f.read(15)
            header = self.header.parse(raw)

            eof = False
            while not eof:
                # np.frombuffer replaces the deprecated binary mode of np.fromstring
                chunk_name_size = np.frombuffer(f.read(1), dtype=np.uint8)[0]
                if chunk_name_size == 0:
                    eof = True
                else:
                    chunk_name = f.read(chunk_name_size)
                    chunk_size = int(np.frombuffer(f.read(4), dtype=np.uint32)[0])
                    chunk_location = f.tell()
                    f.seek(chunk_size, 1)
                    chunk_dict[chunk_name] = [chunk_location, chunk_size]
        print('File {} contains the following chunks:'.format(self.filepath))
        for key in chunk_dict.keys():
            print(key)
        return chunk_dict

    def read_oct_volume(self):
        """ Reads OCT data.

            Returns:
                obj:OCTVolumeWithMetaData
        """

        if b'@IMG_JPEG' not in self.chunk_dict:
            raise ValueError('Could not find OCT header @IMG_JPEG in chunk list')
        with open(self.filepath, 'rb') as f:
            chunk_location, chunk_size = self.chunk_dict[b'@IMG_JPEG']
            f.seek(chunk_location)  # move to the start of the chunk
            raw = f.read(25)
            oct_header = self.oct_header.parse(raw)
            volume = np.zeros((oct_header.height, oct_header.width, oct_header.number_slices))
            for i in range(oct_header.number_slices):
                size = int(np.frombuffer(f.read(4), dtype=np.int32)[0])
                raw_slice = f.read(size)
                image = decode(raw_slice)
                volume[:, :, i] = image
        oct_volume = OCTVolumeWithMetaData([volume[:, :, i] for i in range(volume.shape[2])])
        return oct_volume

    def read_patient_info(self):
        """ Reads patient info.

            Returns:
                parsed @PATIENT_INFO_02 fields (patient id, name, surname, gender, birth date)
        """

        if b'@PATIENT_INFO_02' not in self.chunk_dict:
            raise ValueError('Could not find @PATIENT_INFO_02 in chunk list')
        with open(self.filepath, 'rb') as f:
            chunk_location, chunk_size = self.chunk_dict[b'@PATIENT_INFO_02']
            f.seek(chunk_location)  # move to the start of the chunk
            raw = f.read(615)
            patient_head = self.patient_info.parse(raw)

        return patient_head

    def read_capture_date(self):
        """ Reads capture info.

            Returns:
                eye and date of capture
        """

        if b'@CAPTURE_INFO_02' not in self.chunk_dict:
            raise ValueError('Could not find @CAPTURE_INFO_02 in chunk list')
        with open(self.filepath, 'rb') as f:
            chunk_location, chunk_size = self.chunk_dict[b'@CAPTURE_INFO_02']
            f.seek(chunk_location)  # move to the start of the chunk
            raw = f.read(118)
            #num = int.from_bytes(raw, 'little')
            #out_hex = ['{:02X}'.format(b) for b in raw]
            date = self.capture_date.parse(raw)

        return date

Calling fda.read_patient_info() or fda.read_capture_date() will give you what you need.
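For readers without `construct` installed, the same @CAPTURE_INFO_02 layout can be sketched with the standard `struct` module. This is only an illustration: it assumes little-endian byte order (the `Int8un`/`Int16un` fields above are native-endian), it takes the 0 = right / 1 = left mapping from this thread without further verification, and `parse_capture_info` is a hypothetical helper name.

```python
import struct
from datetime import datetime

# eye (1 byte), unknown 'y' (2 bytes), 103 skipped bytes, six 2-byte date/time fields
CAPTURE_INFO_FORMAT = '<BH103x6H'
EYE_LABELS = {0: 'right', 1: 'left'}  # mapping reported in this thread


def parse_capture_info(raw: bytes):
    """Decode the first 118 bytes of a @CAPTURE_INFO_02 chunk."""
    eye, _y, year, month, day, hour, minute, second = struct.unpack_from(
        CAPTURE_INFO_FORMAT, raw)
    label = EYE_LABELS.get(eye, 'unknown ({})'.format(eye))
    return label, datetime(year, month, day, hour, minute, second)


# Synthetic chunk for illustration only (not taken from a real file).
fake = struct.pack(CAPTURE_INFO_FORMAT, 1, 0, 2020, 12, 17, 11, 2, 17)
print(parse_capture_info(fake))
```

The format string covers 118 bytes, matching the `f.read(118)` call in `read_capture_date`.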

dakenblack · 2020-12-17T21:59:43Z

Hi Antonio,
Thanks for that, but unfortunately my FDA files do not have a "PATIENT_INFO_02" chunk; they have "PATIENT_INFO_03", and as far as I can tell the data in my files does not have the same format as yours. I can get the capture date, and I think I can get the eye laterality, but not the patient ID, which is pretty important.

Jabez

antoniohupa · 2020-12-18T10:53:56Z

Hi Jabez

The same happens to me when I try to parse that information from fda files from a Topcon Triton instead of a 3D Maestro. "PATIENT_INFO_03" seems to be very messy. I have some fda files from a Triton for which I know the patient ID, name, etc. With this information I'll try to find the values in the bytes, but unfortunately I can't right now. In the meantime, could you explore the data some more?
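One rough way to put a number on "messy" is the byte-level Shannon entropy of the chunk. The sketch below is a generic helper, not part of OCT-Converter; a value close to 8 bits per byte is consistent with encrypted or compressed data but does not prove either, and short chunks will tend to score lower than long ones.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (between 0.0 and 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())


print(shannon_entropy(b'\x00' * 64))       # constant padding: 0.0
print(shannon_entropy(bytes(range(256))))  # every byte value once: 8.0
```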

dakenblack · 2020-12-19T00:31:07Z

Yea, the files I've got are from a Triton as well. I found that most of the data in that chunk is exactly the same as the data found in other files. I still need to verify this, but I'm pretty sure the FDA files I've compared hold data for different patients, so I wouldn't expect the chunk to be so similar.

Thanks for your help, any assistance would be greatly appreciated. I'll continue to look through other files as well.

antoniohupa · 2020-12-21T10:06:02Z

That's right, almost all the bytes in that chunk are the same between patients. The only differences are found in the first 4-5 bytes:

Patient id, 399047:

@PATIENT_INFO_03g\x02\x00\x00\xd2\x1bH"0\x196g\x0b\x8e <----
S\x90\xfe\xe6A\xcc\xab8\x9c\x0c\x8a\x023\xae\x11\xd0\x19\xc1\x0eL\xdc\x908\xd8\x1c\xe4I\x15\xf4Y\x0f\x16gz\xe4\xee\xb8\xa0\x16A\xf9g\xc4\xef\x81\x92ac\x9d\x9fP\xb3aa(\x0e8\xce\x0e=\x0be8\x91\x81\xbf\x199y\x8f\xbczT@1\x02\xf9\xc3\x03<\xd0\x81\x1f\x83\xd9-<\xbb\x16\x0e"\xa3\x8d>\x03\xa32\xd1\x1b~\xeaY\x11\xc2\n\x8a]\xa5\xa0tCv\xd1\xcb\xd8\xbd\xc4\x94\x8e\xf9w\x9ao\xcds0\x17\x958N\xb7K\xd1\xabHf\xc4\xd2\xfa\x95(\x934\x05\xc7\xa3\xc4.\xa6\x98kg\x1a\xf6\xef\xcdR\xf29\x880\x01~\xa9\xf6+\xce\xbb\x14\xcf\x04}\x10\x91(\x1e\xb6\xed\x19\xf1>l\xbc\x80Q\xda\xbe^T\xcd\xde\x83}\x1e\xcbF\x98\x8dg#\x07\x85\xb4u\x14\xf8A\x07>\xca@z\x0cR\xf7\xdf\x19A\xa67\xa1@\x1aC4\xd7\x8b\xac\xb5\xb2\xd7\xb0'\xd3O\xf0y,\x97\xc1] \tX<\x157K^\xc3\xf6\xf0Z\xcd\`\xab\xf2\xa2\xa8\x8e\xcb\xb3\x97h\xb4\xc13C&\xf1\n]\xd2\x88VW\x07\t(x\xe9\xd2\xd4\x18}o\xbc\x08\x92\x92k\xec!}\x91\xe2\x04\xe22\xa342\x14LMnB\xd3\xf5uk\xcb\xabuu>\xe4\x8d\xa0L\x9f\n\x10%\x0c\x9d#-\x82\xf1\x17\xf4/I\xa9\xf1\x1b\x98\xcc\x9e\xf8\xf8\xf3[H\xf31\xeb\xf1\x89\x1a\x1d\x1f[\xfdy\xcer\xe0>\xdf\x1fp\xd5\x86\x12\xd1=\xd2tep\x85<u^?c\x16\x89&3:\xfab\x11Ah@CQlC\x97\x94d\x9c\x19}M\xe4S\x93Nr\x1a,\xecdf\xa6\x95\xb3m\x06\xf6v{:Sa\xdc\x0e-o\xad\x9d\xe7\xc3\xf6a\x87\x81\x04\xd1\xdeF\xb7\x1f14Q\xbcR\x84)\x9a\xf9\x0b\xfe\xc4\x87U\xe4\x03C3!\x03\x126\xbb\x96y9\x13\xf9R.\xc4\x9ar\xd5\xff\xa2\xd5\xa52\x9f\\xb5\x9f\xc4l\xb9\xe0v:]\tCkRd\xb6\xe7\xc5\x17\x0c\xce\x94\x8c"\xca\xa6\xfe\x9b;\x11p\x92\xb3H\xc0\x90\xaf/t\xdb\x17\xa6\xa5K4\xc2S\x18\xce\xdf\xc7.\xb4A\xcb4V\xab\xed-\xc5:\xbc\x15N\x88\xfd_\x9b\xb0Y\xaf2\xf9\xcb\xb20\xe7\x98\xb4\xf9\xff\xd3\x9d\r\xce$\x9c\xfd\x1f\xafw\xc4\xac\xf5l\x07\xfc\x95fo\xfc\x00\x94\xbf\x8c\x1b\x0bs\x91\xf1\xd1\x9e\x05\xabtZD^\xda\x10

Patient id, 907034:

@PATIENT_INFO_03g\x02\x00\x00\xd8\x12F!0.6g\x0b\x8e? <---- S\x90\xfe\xe6A\xcc\xab8\x9c\x0c\x8a\x023\xae\x11\xd0\x19\xc1\x0eL\xdc\x976\xd8\x12\xe0iQ\xb1yC\x16gz\xe4\xee\xb8\xa0\x16A\xf9g\xc4\xef\x81\x92ac\x9d\x9fP\xb3a{$\x118\xdak5~~h+\xd0\xdb\xbf\x199y\x8f\xbczT@1\x02\xf9\xc3\x03<\xd0\x81\x1f\x83\xd9-<\xbb\x16\x0e"\xa3\x8d=\x03\xa33\xd1\x15~~\xeaY\x11\xc2\n\x8a]\xa5\xa0tCv\xd1\xcb\xd8\xbd\xc4\x94\x8e\xf9w\x9ao\xcds0\x17\x958N\xb7K\xd1\xabHf\xc4\xd2\xfa\x95(\x934\x05\xc7\xa3\xc4.\xa6\x98kg\x1a\xf6\xef\xcdR\xf29\x880\x01~\xa9\xf6+\xce\xbb\x14\xcf\x04}\x10\x91(\x1e\xb6\xed\x19\xf1>l\xbc\x80Q\xda\xbe^T\xcd\xde\x83}\x1e\xcbF\x98\x8dg#\x07\x85\xb4u\x14\xf8A\x07>\xca@z\x0cR\xf7\xdf\x19A\xa67\xa1@\x1aC4\xd7\x8b\xac\xb5\xb2\xd7\xb0'\xd3O\xf0y,\x97\xc1] \tX<\x157K^\xc3\xf6\xf0Z\xcd\`\xab\xf2\xa2\xa8\x8e\xcb\xb3\x97h\xb4\xc13C&\xf1\n]\xd2\x88VW\x07\t(x\xe9\xd2\xd4\x18}o\xbc\x08\x92\x92k\xec!}\x91\xe2\x04\xe22\xa342\x14LMnB\xd3\xf5uk\xcb\xabuu>\xe4\x8d\xa0L\x9f\n\x10%\x0c\x9d#-\x82\xf1\x17\xf4/I\xa9\xf1\x1b\x98\xcc\x9e\xf8\xf8\xf3[H\xf31\xeb\xf1\x89\x1a\x1d\x1f[\xfdy\xcer\xe0>\xdf\x1fp\xd5\x86\x12\xd1=\xd2tep\x85<u^?c\x16\x89&3:\xfab\x11Ah@CQlC\x97\x94d\x9c\x19}M\xe4S\x93Nr\x1a,\xecdf\xa6\x95\xb3m\x06\xf6v{:Sa\xdc\x0e-o\xad\x9d\xe7\xc3\xf6a\x87\x81\x04\xd1\xdeF\xb7\x1f14Q\xbcR\x84)\x9a\xf9\x0b\xfe\xc4\x87U\xe4\x03C3!\x03\x126\xbb\x96y9\x13\xf9R.\xc4\x9ar\xd5\xff\xa2\xd5\xa52\x9f\\xb5\x9f\xc4l\xb9\xe0v:]\tCkRd\xb6\xe7\xc5\x17\x0c\xce\x94\x8c"\xca\xa6\xfe\x9b;\x11p\x92\xb3H\xc0\x90\xaf/t\xdb\x17\xa6\xa5K4\xc2S\x18\xce\xdf\xc7.\xb4A\xcb4V\xab\xed-\xc5:\xbc\x15N\x88\xfd_\x9b\xb0Y\xaf2\xf9\xcb\xb20\xe7\x98\xb4\xf9\xff\xd3\x9d\r\xce$\x9c\xfd\x1f\xafw\xc4\xac\xf5l\x07\xfc\x95fo\xfc\x00\x94\xbf\x8c\x1b\x0bs\x91\xf1\xd1\x9e\x05\xabtZD^\xda\x10

I've been running tests, but without results so far...
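To check exactly where two such chunks differ, the payloads can be compared byte by byte. The helper below is a minimal sketch with a hypothetical name (`differing_runs`), and the two inputs are short made-up byte strings rather than real patient data.

```python
def differing_runs(a: bytes, b: bytes):
    """Return (start, end) offsets, end exclusive, of the runs where a and b differ.

    Only the first min(len(a), len(b)) bytes are compared.
    """
    runs, start = [], None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i                      # a differing run begins
        elif x == y and start is not None:
            runs.append((start, i))        # the run ended at the previous byte
            start = None
    if start is not None:
        runs.append((start, min(len(a), len(b))))
    return runs


# Made-up stand-ins for two chunk payloads.
chunk_a = bytes([0xd2, 0x1b, 0x48, 0x22, 0x30, 0x53, 0x90, 0xfe])
chunk_b = bytes([0xd8, 0x12, 0x46, 0x21, 0x30, 0x53, 0x90, 0xff])
print(differing_runs(chunk_a, chunk_b))  # [(0, 4), (7, 8)]
```

Running this on the real payloads from several files would show whether the differences really are confined to the first few bytes or also appear further into the chunk.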

dakenblack · 2020-12-29T22:51:13Z

Hi, sorry for the late response.
That is similar to what I'm seeing as well. Do you think the Triton has an internal data store that contains all the patient information? I know it stores it somehow, since the Topcon application is able to export the data.

I did also have a look at other chunks, but nothing seemed obvious to me. Maybe you'll have better luck.

antoniohupa · 2020-12-29T23:17:19Z

Hi Jabez

Since 2017, at least in my hospital, the Triton and Maestro export .fda files with a "patient_info_03" chunk, I guess due to an updated software version. However, I have found that when images are stored in a folder, a filelist with patient data is stored too. That filelist contains the patient data for all the images in that folder. I wrote some code to read that filelist, and from it you can export the patient's ID, gender, laterality, date and hour of capture, name and surname. Check whether you have this file too. Otherwise, it seems impossible to extract the patient's info from that data structure. If you do have it, I can share the code with you.

Greetings

dakenblack · 2020-12-30T05:55:04Z

I see, thanks for that. I'll have a look. Is this folder created by the triton when storing it internally or is it created by the OCTDataExtractor.exe application?

antoniohupa · 2020-12-30T11:24:27Z

I'm not sure. What I have is stored automatically. At least in my hospital, all fda files are stored in folders. Every folder contains a number of fda files and a filelist with the patient information for those fda files. I really don't know what octdataextractor.exe does, but I can ask.

marksgraham · 2021-03-24T09:42:34Z

Going to close for now

witedev · 2024-07-02T20:52:58Z

@antoniohupa Hello, is there any news about the structure of the @PATIENT_INFO_03 chunk? I need the structure, and I am facing a lot of issues.

Thanks in advance

marksgraham closed this as completed Mar 24, 2021
