.NET - Get original content of a PDF signed with iTextSharp
I'm trying to get the original document of a signed PDF in order to compare its hash with that of a stored document.
This is easy when the document has several signatures: in Acrobat Reader you can go to the previous revision of the document, save it, and that's it.
Surprisingly, this does not work for the first signature; there is no straightforward way to get the original data.
As it is not possible with Reader, I have tried to do it programmatically with iTextSharp. Although I have googled, I have not found how to do it. The only relevant post I found is one where no solution was offered.
Has anyone faced this problem and found a solution?
Thanks in advance.
Edit: I put here the code that extracts the data, based on the response of mkl. Read the comments on that response and beware of the problem of the unfixed length of non-signed PDFs.
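using System.IO;
using System.Text;

// ISO-8859-1 maps every byte to exactly one char, so the binary PDF content
// survives the round trip (Encoding.Default is UTF-8 on newer .NET versions).
Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
string sOriginalText = File.ReadAllText("filesigned.pdf", latin1);
// Read the offset stored in the last /Prev entry (no check for a missing entry).
int sTrailerNumberPosition = sOriginalText.LastIndexOf("]/Prev ") + "]/Prev ".Length;
int sTrailerNumberEndPosition = sOriginalText.IndexOf(">", sTrailerNumberPosition);
string sTrailerIndex = sOriginalText.Substring(sTrailerNumberPosition, sTrailerNumberEndPosition - sTrailerNumberPosition);
// Find the startxref value of the previous revision, then the %%EOF that closes it.
int iTrailerIndexPosition = sOriginalText.IndexOf(sTrailerIndex + "\r\n%%EOF");
int iEndPosition = sOriginalText.IndexOf("%%EOF", iTrailerIndexPosition) + "%%EOF".Length;
// Everything up to and including that %%EOF is the previous revision.
string sOutText = sOriginalText.Substring(0, iEndPosition);
File.WriteAllText("c:/originalfile.pdf", sOutText, latin1);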
Whether or not the task of getting the original document of a signed PDF is realizable at all depends on how the signature was applied.
If the signature was applied in append mode (i.e., in the language of the PDF specification ISO 32000-1:2008, as an incremental update, cf. section 7.5.6), you merely have to cut off the appended, incremental update revision.
As you have stored the document which, presumably after signing, has become the document to inspect, you can cut the signed file at the length of the stored one and compare, e.g. using hashes. This suffices to show that the signed document is derived from the original one. There may have been other, intermediary revisions, though, so you might have cut off multiple incremental updates.
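The cut-and-compare check described above can be sketched as follows. This is only a minimal illustration: the class and method names (`RevisionCheck`, `IsPrefixOf`) are hypothetical, SHA-256 is an arbitrary choice of hash, and both files are assumed to be small enough to load into memory.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class RevisionCheck
{
    // Returns true if the first original.Length bytes of the signed file
    // have the same SHA-256 hash as the stored original.
    public static bool IsPrefixOf(byte[] original, byte[] signed)
    {
        // The signed file cannot be an incremental update of a longer file.
        if (original.Length > signed.Length) return false;

        using (SHA256 sha = SHA256.Create())
        {
            byte[] originalHash = sha.ComputeHash(original);
            // Hash only the leading part of the signed file (the "cut").
            byte[] prefixHash = sha.ComputeHash(signed, 0, original.Length);
            return originalHash.SequenceEqual(prefixHash);
        }
    }
}
```

It could be called as `RevisionCheck.IsPrefixOf(File.ReadAllBytes("original.pdf"), File.ReadAllBytes("filesigned.pdf"))`, where `original.pdf` stands for the stored document. A `true` result only shows that the signed file starts with the stored bytes; it says nothing about how many incremental updates were appended afterwards.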
In general, you can find the prior revision by following the /Prev trailer entry of the signed PDF to the cross-reference table of the prior revision and, from there, moving onwards to the document end marker %%EOF, because in an incremental update:
The added trailer shall contain all the entries except the Prev entry (if present) from the previous trailer, whether modified or not. In addition, the added trailer dictionary shall contain a Prev entry giving the location of the previous cross-reference section (see Table 15). Each trailer shall be terminated by its own end-of-file (%%EOF) marker.
In the case of PDFs using cross-reference streams instead of cross-reference tables, there is an analogous entry in the cross-reference stream dictionary:
(Present only if the file has more than one cross-reference stream; not meaningful in hybrid-reference files; see 7.5.8.4, "Compatibility with Applications That Do Not Support Compressed Reference Streams") The byte offset in the decoded stream from the beginning of the file to the beginning of the previous cross-reference stream. This entry has the same function as the Prev entry in the trailer dictionary (Table 15).
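Since every revision ends with its own %%EOF marker, a rough way to list candidate cut points is to scan the raw bytes for that marker. The sketch below is a heuristic under stated assumptions, not a PDF parser: `PdfRevisions` and `FindRevisionEnds` are hypothetical names, the bytes %%EOF are assumed not to occur inside stream data, and one end-of-line after the marker is assumed to belong to the revision (which may not hold for every producer, hence the length problem mentioned in the question's edit).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PdfRevisions
{
    static readonly byte[] Eof = Encoding.ASCII.GetBytes("%%EOF");

    // Returns, for each %%EOF marker found, the offset just past the marker
    // and one optional end-of-line (CR, LF, or CR LF) following it.
    public static List<int> FindRevisionEnds(byte[] pdf)
    {
        var ends = new List<int>();
        for (int i = 0; i + Eof.Length <= pdf.Length; i++)
        {
            bool match = true;
            for (int j = 0; j < Eof.Length; j++)
            {
                if (pdf[i + j] != Eof[j]) { match = false; break; }
            }
            if (!match) continue;

            int end = i + Eof.Length;
            if (end < pdf.Length && pdf[end] == (byte)'\r') end++;
            if (end < pdf.Length && pdf[end] == (byte)'\n') end++;
            ends.Add(end);
            i = end - 1; // continue scanning after this marker
        }
        return ends;
    }
}
```

Each returned offset is a candidate length for an earlier revision. If the signed file contains exactly one incremental update, the second-to-last offset would be the candidate for the unsigned document, which can then be hashed and compared with the stored one.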
You should be aware, though, that the appended, incremental update revision can contain other changes in addition to the signature. Thus, even if the previous revision corresponds to your stored document, you still only know that the signed document is based on the saved one.
If the signature was not applied in append mode, you are out of luck: programs manipulating PDFs (e.g. for signing) might rearrange the binary contents of the document, possibly renumbering objects, changing compression, removing unused objects, etc., while the appearance of the document remains the same.