Discussion:
Please default \pdftrailerid{} to SOURCE_DATE_EPOCH if exported
Chris Lamb
2017-06-27 17:46:26 UTC
Hi tex-live,

Please consider defaulting the \pdftrailerid macro to the value of
SOURCE_DATE_EPOCH if this is exported in the environment.

Whilst we can get a reproducible PDF output in some situations, the
ordering of the elements in these IDs varies. For example:

-/ID [<A979E943C51DC1AAE48AAFF27BBA5A0D> <EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C>]
+/ID [<EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C> <A979E943C51DC1AAE48AAFF27BBA5A0D>]

This varies depending on the *path* from which you are building,
presumably due to the default pdftrailerid being seeded from the
full path name.

Using SOURCE_DATE_EPOCH would make sense here, as it leaves the old
behaviour unchanged when the variable is not exported. Another alternative
could be to set the default to the empty string if SOURCE_DATE_EPOCH is
exported.
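
The fallback being proposed can be sketched in a few lines of Python. This is only an illustration of the idea, not pdfTeX code: the function name `trailer_id_seed` and the example path are hypothetical, and the epoch value is arbitrary.

```python
import os


def trailer_id_seed(default_seed, environ=None):
    """Pick the seed for the PDF ID (hypothetical helper).

    Use SOURCE_DATE_EPOCH when it is exported; otherwise keep the
    existing, path-dependent default so old behaviour is unchanged.
    """
    env = os.environ if environ is None else environ
    epoch = env.get("SOURCE_DATE_EPOCH")
    if epoch is None:
        return default_seed
    return epoch


# With the variable exported, the seed no longer depends on the path.
print(trailer_id_seed("/home/user/build/doc.pdf",
                      {"SOURCE_DATE_EPOCH": "1498585586"}))
# Without it, the caller's default is returned untouched.
print(trailer_id_seed("/home/user/build/doc.pdf", {}))
```

Passing the environment as a parameter keeps the sketch easy to test; a real tool would simply read the process environment.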

I believe this is the code that actually prints these tokens:

http://sources.debian.net/src/texlive-bin/2017.20170613.44572-2/texk/web2c/pdftexdir/pdftex.web/#L20126-L20129

(I would provide a patch but my WEB is very very rusty...)


Best wishes,
--
,''`.
: :' : Chris Lamb, Debian Project Leader
`. `'` ***@debian.org / chris-lamb.co.uk
`-
Karl Berry
2017-06-28 21:51:46 UTC
-/ID [<A979E943C51DC1AAE48AAFF27BBA5A0D> <EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C>]
+/ID [<EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C> <A979E943C51DC1AAE48AAFF27BBA5A0D>]

This varies depending on the *path* from which you are building,

Ack, thanks for the report.

I'll send a patch after I have a chance to construct one, if no one else
(Akira?) gets there first. --thanks, karl.
Chris Lamb
2017-07-05 07:40:01 UTC
Hi Karl,
Post by Karl Berry
Post by Chris Lamb
-/ID [<A979E943C51DC1AAE48AAFF27BBA5A0D> <EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C>]
+/ID [<EA2B7DBF37C8E3DC2AA0AFDAA1FF4E0C> <A979E943C51DC1AAE48AAFF27BBA5A0D>]
This varies depending on the *path* from which you are building,
Ack, thanks for the report.
I'll send a patch after I have a chance to construct one, if no one else
(Akira?) gets there first.
Just a friendly ping on this; we have a whole bunch of packages we'd love to be
reproducible within Debian and this is the only remaining blocker for them :)


Best wishes,
--
,''`.
: :' : Chris Lamb, Debian Project Leader
`. `'` ***@debian.org / chris-lamb.co.uk
`-
Karl Berry
2017-07-05 21:40:32 UTC
Just a friendly ping on this; we have a whole bunch of packages we'd
love to be reproducible within Debian and this is the only remaining
blocker for them :)

Maybe someone in Debian could come up with a patch, since it is
important to you. I doubt I will be able to look at it before August,
at best. Sorry. -k
Chris Lamb
2017-07-05 21:51:53 UTC
Dear Karl,
Post by Karl Berry
Post by Chris Lamb
Just a friendly ping on this; we have a whole bunch of packages we'd
love to be reproducible within Debian and this is the only remaining
blocker for them :)
Maybe someone in Debian could come up with a patch, since it is
important to you
My apologies; I inferred from your previous mail that a patch would be
quick and easy for you, which is why I didn't jump in immediately.

Am very happy to get my hands dirty in the code! (Although any quick
implementation hints before I do that…?)


Best wishes,
--
,''`.
: :' : Chris Lamb, Debian Project Leader
`. `'` ***@debian.org / chris-lamb.co.uk
`-
Karl Berry
2017-09-02 22:00:26 UTC
Chris - Anders sent the patch below to eliminate the cwd from the PDF
ID. Looked plausible to me. Maybe you could give it a try and see if it
does the job for you? --thanks, karl.


Date: Sat, 2 Sep 2017 16:45:50 -0400 (EDT)
From: Anders Kaseorg <***@mit.edu>
To: Karl Berry <***@freefriends.org>
cc: ***@tug.org
Subject: Re: [pdftex] Consider removing dependence of PDF ID field on current
directory name
With that in mind, could printID be changed to avoid depending on the
current directory name, either by default
I think we can change it by default. Patches welcome.
Alright. How does this look?

Anders


diff --git a/source/src/texk/web2c/pdftexdir/ChangeLog b/source/src/texk/web2c/pdftexdir/ChangeLog
index 116541e8..a5ebe6ea 100644
--- a/source/src/texk/web2c/pdftexdir/ChangeLog
+++ b/source/src/texk/web2c/pdftexdir/ChangeLog
@@ -1,3 +1,9 @@
+2017-09-02 Anders Kaseorg <***@mit.edu>
+
+	* utils.c (printID): Do not hash the current directory name into
+	the PDF ID field, since any randomness in it would lead to
+	non-reproducible builds.
+
 2017-03-16 Pali Roh\'ar <***@gmail.com>

 	Allow .enc files for bitmap fonts, following thread at
diff --git a/source/src/texk/web2c/pdftexdir/utils.c b/source/src/texk/web2c/pdftexdir/utils.c
index 67ff8e9d..fda97666 100644
--- a/source/src/texk/web2c/pdftexdir/utils.c
+++ b/source/src/texk/web2c/pdftexdir/utils.c
@@ -697,9 +697,10 @@ void unescapehex(poolpointer in)
     </blockquote>
     This stipulates only that the two IDs must be identical when the file is
     created and that they should be reasonably unique. Since it's difficult
-    to get the file size at this point in the execution of pdfTeX and
-    scanning the info dict is also difficult, we start with a simpler
-    implementation using just the first two items.
+    to get the file size at this point in the execution of pdfTeX, scanning
+    the info dict is also difficult, and any randomness in the current
+    directory name would lead to non-reproducible builds, we start with a
+    simpler implementation using just the current time and the file name.
  */
 void printID(strnumber filename)
 {
@@ -707,29 +708,13 @@ void printID(strnumber filename)
     md5_byte_t digest[16];
     char id[64];
     char *file_name;
-    char pwd[4096];
     /* start md5 */
     md5_init(&state);
     /* get the time */
     initstarttime();
     md5_append(&state, (const md5_byte_t *) start_time_str, strlen(start_time_str));
     /* get the file name */
-    if (getcwd(pwd, sizeof(pwd)) == NULL)
-        pdftex_fail("getcwd() failed (%s), path too long?", strerror(errno));
-#ifdef WIN32
-    {
-        char *p;
-        for (p = pwd; *p; p++) {
-            if (*p == '\\')
-                *p = '/';
-            else if (IS_KANJI(p))
-                p++;
-        }
-    }
-#endif
     file_name = makecstring(filename);
-    md5_append(&state, (const md5_byte_t *) pwd, strlen(pwd));
-    md5_append(&state, (const md5_byte_t *) "/", 1);
     md5_append(&state, (const md5_byte_t *) file_name, strlen(file_name));
     /* finish md5 */
     md5_finish(&state, digest);
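
For readers who do not want to trace the C, the effect of the patch can be modelled roughly in Python. This is a sketch under assumptions, not a port of printID: the helper name `pdf_id`, the timestamp string, and the directory names are all made up for illustration, and the real start_time_str format may differ. Only the order of the hashed inputs (time, then optionally the directory and a "/", then the file name) follows the diff above.

```python
import hashlib


def pdf_id(start_time, file_name, cwd=None):
    """Model the MD5 inputs used for the /ID entry (hypothetical helper).

    With cwd given, this mimics the pre-patch input order:
    time, directory, "/", file name. With cwd=None it mimics the
    patched version: time and file name only.
    """
    md5 = hashlib.md5()
    md5.update(start_time.encode())
    if cwd is not None:
        md5.update(cwd.encode())
        md5.update(b"/")
    md5.update(file_name.encode())
    return md5.hexdigest().upper()


t = "20170902164550"  # assumed fixed build time, e.g. from SOURCE_DATE_EPOCH

# Pre-patch: the same document built in two directories gets two IDs.
print(pdf_id(t, "paper.pdf", cwd="/build/first"))
print(pdf_id(t, "paper.pdf", cwd="/build/second"))

# Post-patch: the directory no longer contributes, so the ID is stable.
print(pdf_id(t, "paper.pdf"))
```

Note that the ID still depends on the start time, so a reproducible result also requires the build time itself to be pinned.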
Chris Lamb
2017-09-03 08:40:06 UTC
Hi Karl & Anders,
Post by Karl Berry
Chris - Anders sent the patch below to eliminate the cwd from the PDF
ID. Looked plausible to me. Maybe you could give it a try and see if it
does the job for you?
It absolutely does; thank you :) Would love to see this in the next
release, so please commit to your VCS.

Just FYI I've filed this in Debian as:

https://bugs.debian.org/874102


Best wishes,
--
,''`.
: :' : Chris Lamb
`. `'` ***@debian.org / chris-lamb.co.uk
`-
Norbert Preining
2017-09-03 11:23:46 UTC
I'll include the patch in a build of texlive-bin soonish, as the next release of TeX Live is still far off.

Norbert
Post by Chris Lamb
Hi Karl & Anders,
Post by Karl Berry
Chris - Anders sent the patch below to eliminate the cwd from the PDF
ID. Looked plausible to me. Maybe you could give it a try and see if it
does the job for you?
It absolutely does; thank you :) Would love to see this in the next
release, so please commit to your VCS.
https://bugs.debian.org/874102
Best wishes,
--
PREINING Norbert + TeX Live & Debian Developer + http://www.preining.info
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
Chris Lamb
2017-09-03 19:03:39 UTC
Hi Norbert,
Post by Norbert Preining
I'll include the patch in a build of texlive-bin soonish as the
next release of TeX Live is still far off.
Thank you. :) Please close #874102 when you do so I can reschedule
rebuilds on tests.reproducible-builds.org.

We need to improve isdebianreproducibleyet.com, and packages in the
toolchain are especially valuable to get fixed.


Regards,
--
,''`.
: :' : Chris Lamb
`. `'` ***@debian.org / chris-lamb.co.uk
`-
Karl Berry
2017-11-14 23:51:47 UTC
FYI, I've committed Anders' change to avoid using getcwd in the /ID
computation in the pdftex (r782) and TeX Live (r45808) repositories.
Thanks! -k