I'm looking for a library and/or guide that would allow me to encode an image with DCT (discrete cosine transform) so I can place it in a basic 1.0 PDF file. (FYI, I'm using https://git.catseye.tc/pdf.lua/ to create the PDF.)
I've searched the internet but couldn't find anything. Is anyone on SO aware of something that uses Lua to encode a JPEG with DCT?
Update:
Based on feedback, here is some additional information on what I am asking.
If you open up a PDF file, the stored JPEG data appears in an image XObject. Here is an example:
14 0 obj
<<
/Intent/RelativeColorimetric
/Type/XObject
/ColorSpace/DeviceGray
/Subtype/Image
/Name/X
/Width 2988
/BitsPerComponent 8
/Length 134030
/Height 2286
/Filter/DCTDecode
>>
stream
(binary data)
endstream
The /Type and /Subtype entries show that this is an image XObject. The key entry is the /Filter value, DCTDecode, which indicates JPEG data (JPXDecode indicates JPEG 2000, which also works). The data I need goes between stream and endstream.
I'm looking for help with how to convert an image into the DCT format needed.
The prime difference for DCT/JPG in PDF is that the .jpeg in a basic PDF should be "baseline", much as it was in 1992 (progressive JPEG is only accepted from PDF 1.3 onwards); see also https://ia801003.us.archive.org/5/items/pdf320002008/PDF32000_2008.pdf#page=42. That is what MS Paint (or any command-driven graphics app) will save as a "simple" .jpeg (not any exotic type). So here on the right is the everyday .jpeg from an MS Paint conversion from PNG or any other complex format, and on the left is the exact same /DCTDecode object after it was imported by a PDF writer.
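To check whether a given file really is baseline, and to read the values that go into /Width, /Height and /ColorSpace, you can walk the JPEG markers until you reach the frame header. The following is only a minimal sketch in Python (chosen here as a neutral scripting language; the same byte-walking could be ported to Lua). The function name `jpeg_info` and the returned keys are my own illustration, not part of any library.

```python
import struct

# Start-of-frame markers. 0xC0 is baseline DCT, 0xC2 is progressive.
# 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) sit in the same range but are not frame headers.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}

# Usual mapping from component count to a PDF colour space (an assumption:
# unusual JPEGs may need something else).
COLORSPACES = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}


def jpeg_info(data: bytes) -> dict:
    """Return width, height, component count and baseline flag of a JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("lost marker sync at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:                      # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            pos += 2                            # standalone marker, no length
            continue
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 10 > len(data):
                raise ValueError("truncated frame header")
            precision, height, width, components = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            return {
                "width": width,
                "height": height,
                "bits": precision,
                "components": components,
                "colorspace": COLORSPACES.get(components),
                "baseline": marker == 0xC0,
            }
        if marker == 0xDA:                      # scan data reached, stop
            break
        pos += 2 + seg_len                      # skip to the next segment
    raise ValueError("no frame header (SOF) found")
```

Typical use would be `info = jpeg_info(open("image.jpg", "rb").read())`, where `image.jpg` stands for whatever file you exported; if `info["baseline"]` is false, re-save the image as a plain JPEG first.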
So if we export the image from the PDF, we get the JPEG back (not the source PNG). To check that they are identical, copy and paste the image out of the PDF or use an extractor.
The image.jpg used for my command-line wrap as a PDF is 5,757 bytes, and the image extracted from the PDF is 5,757 bytes, so we can expect a match. Check that they are identical binary files (what goes in comes out, which is very rare for a PDF).
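The "what goes in comes out" check can also be scripted rather than done by eye. This is a rough sketch under some assumptions: the PDF is uncompressed at the object level, the image dictionary contains no nested dictionaries, and /Length is a direct integer rather than an indirect reference. `extract_dct_stream` is a hypothetical helper name, not a function from any PDF library.

```python
import re

# A dictionary with no nested "<<" inside it, directly followed by the
# "stream" keyword and its end-of-line.
DICT_STREAM = re.compile(rb"<<((?:(?!<<).)*?)>>\s*stream(?:\r\n|\n)", re.DOTALL)

# A direct integer length; "/Length 12 0 R" (indirect) is deliberately not matched.
DIRECT_LENGTH = re.compile(rb"/Length\s+(\d+)\b(?!\s+\d+\s+R)")


def extract_dct_stream(pdf: bytes) -> bytes:
    """Return the raw bytes of the first /DCTDecode stream in the PDF."""
    for match in DICT_STREAM.finditer(pdf):
        dictionary = match.group(1)
        if b"/DCTDecode" not in dictionary:
            continue
        length = DIRECT_LENGTH.search(dictionary)
        if length is None:
            continue                      # indirect /Length: not handled here
        start = match.end()
        return pdf[start:start + int(length.group(1))]
    raise ValueError("no /DCTDecode stream with a direct /Length found")


def same_image(jpeg: bytes, pdf: bytes) -> bool:
    """True when the embedded stream is byte-for-byte the source JPEG."""
    return extract_dct_stream(pdf) == jpeg
```

Matching sizes (5,757 bytes in and out) are a good hint; a byte-for-byte comparison like this confirms it.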
So to make a one-page PDF from an image you simply need a header
where a Windows command line, or any other scripting language, can write that last line with the correct values. You also need a trailer, which is where it may get messy, so as much of the tail as possible was moved to the head to keep trailer writing minimal. I have done similar command-line embedding for video and audio, so DCT (JPEG) images should not be a problem (except that I prefer lossless, pixel-perfect PNG, and that is way harder).
Here is a matching trailer for the header above
You simply need to ensure the startxref value is correct.
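For anyone who would rather compute the offsets than count them by hand, the same header, image, trailer layout can be sketched in a script. This is not the header and trailer I used above, just one possible minimal arrangement; the object numbering, the `/Im0` name, the one-point-per-pixel page size and the function name `build_pdf` are all assumptions made for the example.

```python
def build_pdf(jpeg: bytes, width: int, height: int,
              colorspace: str = "DeviceRGB") -> bytes:
    """Wrap an already-encoded JPEG in a one-page PDF, byte for byte."""
    # Content stream: scale the unit square to the image size and paint it.
    content = f"q {width} 0 0 {height} 0 0 cm /Im0 Do Q".encode("ascii")

    image_dict = (
        f"<< /Type /XObject /Subtype /Image /Name /Im0 "
        f"/Width {width} /Height {height} /ColorSpace /{colorspace} "
        f"/BitsPerComponent 8 /Filter /DCTDecode /Length {len(jpeg)} >>"
    ).encode("ascii")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
         f"/Resources << /ProcSet [/PDF /ImageB /ImageC] "
         f"/XObject << /Im0 4 0 R >> >> /Contents 5 0 R >>").encode("ascii"),
        # The JPEG bytes go in untouched between stream and endstream.
        image_dict + b"\nstream\n" + jpeg + b"\nendstream",
        (f"<< /Length {len(content)} >>".encode("ascii")
         + b"\nstream\n" + content + b"\nendstream"),
    ]

    out = bytearray(b"%PDF-1.0\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))          # byte offset of "N 0 obj"
        out += f"{number} 0 obj\n".encode("ascii") + body + b"\nendobj\n"

    xref_pos = len(out)                   # this is the startxref value
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"       # each entry is exactly 20 bytes
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode("ascii")
    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_pos}\n%%EOF\n").encode("ascii")
    return bytes(out)
```

The width, height and colour space must match the JPEG itself, and the result has to be written in binary mode, for example `open("out.pdf", "wb").write(build_pdf(data, w, h))`, where `out.pdf`, `data`, `w` and `h` are placeholders.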
So the working procedure is: first use any graphics app to find the width, height and byte length, apply those dimensions (and thus the offsets) to the end of the header and to the trailer, then briefly
Since JPEG is a binary compressed encoding, you can't use plain-text copy and paste, as that destroys the high (8th) bit of each byte and corrupts the JPEG; hence it is a poor fit for building in a purely textual fashion. It therefore needs a binary sandwich between the two text parts, hence
copy /b
[Later Edit]
I gave a fairly complex value above for object 5, which can be simplified. Say we have an image to be scaled to 500 pt by 477 pt and we want it centred: we can offset it by half of the extra width and half of the extra height, so it simplifies to
W 0 0 H dx/2 dy/2
where dx is the total width of the whitespace, and similarly dy for the height.
[Even LATER edit]
For a different question I revisited the methods needed to use a simpler cmd file to automate a single pixel-perfect JPEG addition. It is not much different from the above and needs some spit and polish for production. However, it shows how to automate this for various source images and could be improved to handle a set of images in a loop, but it is a starting point.
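Going back to the centring shorthand W 0 0 H dx/2 dy/2 from the earlier edit, the arithmetic is simple enough to script when looping over several images. A small sketch, assuming a US Letter page of 612 pt by 792 pt and an image resource named `/Im0` (both are my assumptions, as is the function name `centred_cm`):

```python
def centred_cm(img_w: float, img_h: float,
               page_w: float = 612, page_h: float = 792) -> str:
    """Return a content stream that paints /Im0 centred on the page."""
    dx = page_w - img_w       # total horizontal whitespace
    dy = page_h - img_h       # total vertical whitespace
    # W 0 0 H dx/2 dy/2 cm : scale to W x H, then shift by half the whitespace.
    return f"q {img_w:g} 0 0 {img_h:g} {dx / 2:g} {dy / 2:g} cm /Im0 Do Q"
```

For the 500 pt by 477 pt image above on that page size, dx is 112 and dy is 315, so the offsets come out as 56 and 157.5.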

A demo working set can be found here https://github.com/GitHubRulesOK/MyNotes/blob/master/jpgTOpdf.zip