Question 1: I am currently working on a project where we translate English content into 17 other languages. To reduce translation costs, we compute an MD5 hash of each topic and use the result to decide whether the topic is new (Master) or was translated earlier (Obsolete). However, the logic has become quite complicated and we want to reduce the complexity. We are also using FileNet as our content management system, which is quite old. :) Essentially, I am looking for the best approach to content de-duplication other than MD5 hashing.
Note: a "topic" here means an XML file with images that is rendered via XSLT; it does not follow the DITA standard.
Question 2: What is the best alternative for rendering a non-standard, non-DITA XML file in the UI, e.g. as HTML or PDF?
Thanks in advance... waiting for your suggestions.
Question 1
I recommend not relying on hashes or timestamps alone, but that depends on your environment. If you refactor variables, change indentation, or add/remove comments, the content itself does not change and a translation should not be triggered; in that case you could rely on metadata to drive a semi-automatic process. In addition, you could use a diffing mechanism to compare the current version of a document with an earlier one.
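One way to make such a comparison insensitive to cosmetic edits is to canonicalize the XML (dropping comments and insignificant whitespace) before hashing or diffing. Below is a minimal sketch in Python, assuming lxml is available; the file name and the stored fingerprint are placeholders standing in for whatever your CMS keeps per topic.

```python
import hashlib
from lxml import etree

def content_fingerprint(path):
    """Hash the canonicalized content of an XML topic, so that
    indentation or comment changes do not alter the fingerprint."""
    parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
    tree = etree.parse(path, parser)
    canonical = etree.tostring(tree, method="c14n")  # canonical XML (C14N)
    return hashlib.sha256(canonical).hexdigest()

# Hypothetical usage: compare against the fingerprint stored when the
# topic was last translated; only a mismatch means real content changed.
# if content_fingerprint("topic.xml") != previously_stored_fingerprint:
#     send the topic for translation
```

The same canonicalized output can also be fed to a text differ if you want to show reviewers exactly what changed instead of just flagging the topic.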
Question 2
As with the first question, this one is hard to answer without knowing your environment. It would probably be smarter to first convert your files to DITA or Markdown and then use the DITA-OT or a Markdown processor for the subsequent transformation.
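If converting to DITA is not an option and you already have XSLT stylesheets, you can keep rendering the topics directly. The sketch below transforms a topic to HTML with lxml; the topic and stylesheet names are placeholders. For PDF you would typically transform to XSL-FO and run a formatter such as Apache FOP instead.

```python
from lxml import etree

def render_topic_html(topic_path, stylesheet_path):
    """Apply an existing XSLT stylesheet to a non-DITA topic and return HTML."""
    transform = etree.XSLT(etree.parse(stylesheet_path))
    html_tree = transform(etree.parse(topic_path))
    return etree.tostring(html_tree, method="html", pretty_print=True)

# Placeholder names; substitute your own topic and stylesheet.
html = render_topic_html("topic.xml", "topic-to-html.xsl")
with open("topic.html", "wb") as out:
    out.write(html)
```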