Reformat xattr output and store it in MySQL using a BASH script

I have a script that collects a bunch of file system object information (hashes, dates, etc.) and stores it in a MySQL database (one row per object). The script runs in Bash on Mac OS X 10.10.4 (MBP).

I would like to store the HFS+ extended attributes in the database as well. xattr gives the output shown below. I would like to drop the hex and the formatting text, leaving just the attribute name and the ASCII value. That means not only removing the leading offsets, the hex bytes, and the | delimiters, but also concatenating the value onto one line per attribute, with the attribute name prepended. Note that each object (file/folder) may have multiple attributes, and the attribute names are not known in advance.

Take this input:

$ xattr -l wordpress-3.9.6.zip
com.apple.metadata:kMDItemWhereFroms:
00000000  62 70 6C 69 73 74 30 30 A2 01 02 5F 10 29 68 74  |bplist00..._.)ht|
00000010  74 70 73 3A 2F 2F 77 6F 72 64 70 72 65 73 73 2E  |tps://wordpress.|
00000020  6F 72 67 2F 77 6F 72 64 70 72 65 73 73 2D 33 2E  |org/wordpress-3.|
00000030  39 2E 36 2E 7A 69 70 5F 10 2F 68 74 74 70 73 3A  |9.6.zip_./https:|
00000040  2F 2F 77 6F 72 64 70 72 65 73 73 2E 6F 72 67 2F  |//wordpress.org/|
00000050  64 6F 77 6E 6C 6F 61 64 2F 72 65 6C 65 61 73 65  |download/release|
00000060  2D 61 72 63 68 69 76 65 2F 08 0B 37 00 00 00 00  |-archive/..7....|
00000070  00 00 01 01 00 00 00 00 00 00 00 03 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00 00 00 00 69              |...........i|
0000008c
com.apple.quarantine: 0001;55701556;Google Chrome.app;8AD80928-CB48-48EA-8A1B-EC4B0BE656A9

And make it look like this:

com.apple.metadata:kMDItemWhereFroms: bplist00..._.)https://wordpress.org/wordpress-3.9.6.zip_./https://wordpress.org/download/release-archive/..7...............................i
com.apple.quarantine: 0001;55701556;Google Chrome.app;8AD80928-CB48-48EA-8A1B-EC4B0BE656A9

Thanks for any help

MC

There is 1 solution below

xattr is not very customizable; it's meant more for human browsing than scripted use. You're better off using another language. Here's an example in Python:

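import xattr  # third-party xattr package, not the command-line tool

x = xattr.xattr('wordpress-3.9.6.zip')
# Iterating over the xattr object yields attribute names only
for name in x:
    print(name, repr(x[name]))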

You may want to drop the call to repr (or use a different wrapper around x[name]), depending on the desired output.
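If the goal is the one-line-per-attribute layout from the question, one possible wrapper is sketched below. It assumes the attribute values arrive as raw bytes (as they do from the Python xattr package) and approximates the text column of the hexdump by replacing every byte outside printable ASCII with a dot. The names printable and format_attrs, and the attribute name example.truncated, are made up for illustration.

```python
def printable(value: bytes) -> str:
    """Keep printable ASCII (0x20-0x7E); replace every other byte with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in value)


def format_attrs(attrs) -> list:
    """Return one 'name: value' line per attribute.

    `attrs` is any mapping of attribute name -> raw bytes, for example
    {name: x[name] for name in x} built from an xattr.xattr object.
    """
    return [f"{name}: {printable(value)}" for name, value in attrs.items()]


# Sample data: the quarantine value from the question, plus the first
# 16 bytes of the binary attribute under a hypothetical name.
sample = {
    "com.apple.quarantine": b"0001;55701556;Google Chrome.app;"
                            b"8AD80928-CB48-48EA-8A1B-EC4B0BE656A9",
    "example.truncated": b"bplist00\xa2\x01\x02_\x10)ht",
}
for line in format_attrs(sample):
    print(line)
```

Each returned line can then be passed to whatever already inserts rows into the database.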

Note that you almost certainly do not want the . characters from the ASCII column of the xattr program's output: each one stands in for a non-printable byte, so the original value cannot be recovered from them.
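A lossless alternative, as a rough sketch: the value in the question starts with bplist00, which marks a binary property list, and Python 3's standard plistlib can parse those. The helper below (decode_value is a made-up name) tries that first, then UTF-8 text, and falls back to a hex string for anything else. The sample blob is built with plistlib from the two URLs visible in the question's dump, so it is a stand-in for the real attribute bytes, not a copy of them.

```python
import plistlib


def decode_value(value: bytes):
    """Decode an attribute value without throwing information away.

    Binary property lists become Python objects, valid UTF-8 becomes
    text, and anything else is returned as a hex string.
    """
    if value.startswith(b"bplist00"):
        try:
            return plistlib.loads(value)
        except plistlib.InvalidFileException:
            pass  # had the plist magic but could not be parsed
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return value.hex()


# Stand-in for com.apple.metadata:kMDItemWhereFroms (assumed structure:
# an array of two URL strings, as the dump in the question suggests).
urls = [
    "https://wordpress.org/wordpress-3.9.6.zip",
    "https://wordpress.org/download/release-archive/",
]
blob = plistlib.dumps(urls, fmt=plistlib.FMT_BINARY)

print(decode_value(blob))
print(decode_value(b"0001;55701556;Google Chrome.app"))
```

Decoded lists could be joined or serialized (for example as JSON) before being stored in a single database column; which form is best depends on how the data will be queried.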