Many years ago, I did some TIFF file parsing myself at the binary stream/block level, and I found it quite straightforward. There is of course a difference between deciphering the container (what a TIFF file is at the highest level) and understanding and using all the data it can hold. My guess is that a GeoTIFF has some tagged fields from which coordinates can be extracted, and a larger blob in which a whole PNG or JPG file resides. If that's the case, a high-level parser is not needed. Or maybe the contained JPG and PNG data come without their own "header", as raw image data only. In that case an image decoder that works without headers is needed, and that might be hard to find.
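To give an idea of how little is involved at the container level, here is a minimal sketch in Python. It assumes a classic (non-BigTIFF) file, looks only at the first IFD, and handles just a handful of field types. The names read_tags, FIELD_TYPES and GEO_TAGS are made up for this illustration. The tag numbers 33550, 33922 and 34735 are the standard GeoTIFF tags for pixel scale, tiepoints and the geo key directory.

```python
import struct

# Byte size and struct code for the TIFF field types handled in this sketch:
# 1 = BYTE, 2 = ASCII, 3 = SHORT, 4 = LONG, 12 = DOUBLE. Other types are skipped.
FIELD_TYPES = {1: (1, "B"), 2: (1, "s"), 3: (2, "H"), 4: (4, "I"), 12: (8, "d")}

# Standard GeoTIFF tags that carry the georeferencing.
GEO_TAGS = {33550: "ModelPixelScale", 33922: "ModelTiepoint", 34735: "GeoKeyDirectory"}


def read_tags(data):
    """Return {tag_number: tuple_of_values} for the first IFD of a TIFF byte string."""
    # Bytes 0-1 give the byte order: "II" is little-endian, "MM" is big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("unexpected magic number (BigTIFF is not handled here)")

    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        # Each IFD entry is 12 bytes: tag, field type, value count, value-or-offset.
        entry = ifd_offset + 2 + 12 * i
        tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
        if ftype not in FIELD_TYPES:
            continue
        size, code = FIELD_TYPES[ftype]
        if size * n <= 4:
            # Small values are stored directly in the last 4 bytes of the entry.
            pos = entry + 8
        else:
            # Larger values live elsewhere; the entry holds their file offset.
            (pos,) = struct.unpack_from(order + "I", data, entry + 8)
        tags[tag] = struct.unpack_from(f"{order}{n}{code}", data, pos)
    return tags
```

Called on the bytes of a GeoTIFF, the result can be filtered with GEO_TAGS to see which georeferencing tags are present. Interpreting the geo keys themselves is a separate step that this sketch does not attempt.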
edit
Here are my new thoughts. I examined the GeoTIFFs I've got. They are uncompressed indexed images: no PNG or JPG, just raw scanline data. I took a look at my own TIFF parsing code from ten years ago. It was some hundred lines of code and could only read uncompressed images with 4 samples per pixel. Making it work for a few more formats would probably triple the coding effort. Locus could also limit the types of image data it supports. There are a lot of tools that can convert to a specific TIFF format. There are also tools that can put the geo data back in from sidecar files if it is lost in conversion. Going forward, more image types could be supported to make it more user-friendly.
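As an illustration of the "limit the supported types" idea, here is a rough Python sketch that accepts only uncompressed, 8-bit, palette-indexed strips and rejects everything else. It assumes the tags have already been read into a dict mapping tag number to a tuple of values. The function names is_supported and read_indexed_rows are invented for this example and are not anything Locus actually uses. The tag numbers are the baseline TIFF ones: 256/257 width and height, 258 BitsPerSample, 259 Compression, 262 PhotometricInterpretation, 273/279 strip offsets and byte counts, 320 ColorMap.

```python
def is_supported(tags):
    """True for uncompressed, 8-bit, palette-indexed images only."""
    compression = tags.get(259, (1,))[0]     # TIFF default is 1 = no compression
    bits = tags.get(258, (1,))[0]            # TIFF default is 1 bit per sample
    photometric = tags.get(262, (None,))[0]  # 3 = palette colour, no default
    return compression == 1 and bits == 8 and photometric == 3


def read_indexed_rows(data, tags):
    """Return the image as a list of rows, each a list of (r, g, b) tuples."""
    if not is_supported(tags):
        raise ValueError("only uncompressed 8-bit indexed TIFFs are handled")
    width, height = tags[256][0], tags[257][0]

    # ColorMap stores all reds, then all greens, then all blues, as 16-bit values.
    cmap = tags[320]
    n = len(cmap) // 3
    palette = [(cmap[i] >> 8, cmap[n + i] >> 8, cmap[2 * n + i] >> 8) for i in range(n)]

    # Strips hold whole scanlines, so joining them in order gives the raw image.
    pixels = b"".join(data[off:off + size] for off, size in zip(tags[273], tags[279]))
    return [[palette[p] for p in pixels[y * width:(y + 1) * width]] for y in range(height)]
```

Files in any other layout would first have to be converted with an external tool, which keeps the reader itself small.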