Announce: TeX Live on AWS Lambda
Sam O'Connor
2018-03-31 04:30:31 UTC
Hi Tex-Live mailing list,

I’m not sure if this is the right place to post this.

My client has a web application that uses LaTeX to produce .PDF documents on-demand.
We’re using AWS Lambda (https://aws.amazon.com/lambda) to host the PDF generation code so that we can scale to an arbitrary number of simultaneous users without having to manage servers.

The main problem to overcome is that AWS Lambda is restricted to a 50MB .ZIP deployment archive, so a normal TeX Live installation won’t fit.

To solve this problem I’ve built a Dockerfile that:
- installs a minimal subset of TeX Live,
- installs a configurable list of required packages,
- deletes unneeded large files (e.g. doc pdfs) and
- creates a .ZIP archive suitable for deployment to AWS Lambda.
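
The last step in the list above, together with the size limit, could be sketched like this. This is only an illustration in Python, not the script from the repository: `build_archive` and `LIMIT_BYTES` are made-up names, and reading "50MB" as 50 * 1024 * 1024 bytes is an assumption.

```python
import zipfile
from pathlib import Path

# Assumption: "50MB" is read as 50 MiB; check the limit that applies to you.
LIMIT_BYTES = 50 * 1024 * 1024


def build_archive(source_dir, zip_path, limit=LIMIT_BYTES):
    """Zip every file under source_dir and fail if the archive is too large.

    zip_path should live outside source_dir so the archive does not
    try to include itself.
    """
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the root of the deployment tree.
                zf.write(path, path.relative_to(source_dir).as_posix())
    size = Path(zip_path).stat().st_size
    if size > limit:
        raise ValueError(f"archive is {size} bytes, limit is {limit}")
    return size
```

Failing the build when the archive is too big makes it obvious when a newly added package pushes the installation over the limit.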

The result is an AWS Lambda function that takes a base64 encoded .ZIP file containing .tex files (and images etc) and returns a base64 encoded .PDF file.
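
For readers who want a picture of that request/response shape, here is a minimal Python sketch. It is not the code from the repository: the event fields `input` and `output`, the main file name `document.tex`, and the helper names are all hypothetical.

```python
import base64
import io
import subprocess
import tempfile
import zipfile
from pathlib import Path


def run_pdflatex(workdir, main_tex):
    """Compile main_tex inside workdir and return the path of the PDF."""
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", main_tex],
        cwd=workdir,
        check=True,
    )
    return Path(workdir) / (Path(main_tex).stem + ".pdf")


def render(zip_b64, main_tex="document.tex", compile_fn=run_pdflatex):
    """Unpack a base64 .ZIP, compile it, and return the PDF as base64 text."""
    archive = zipfile.ZipFile(io.BytesIO(base64.b64decode(zip_b64)))
    # tempfile uses /tmp by default, which is the writable area on Lambda.
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        archive.extractall(workdir)
        pdf_path = compile_fn(workdir, main_tex)
        return base64.b64encode(pdf_path.read_bytes()).decode("ascii")


def lambda_handler(event, context):
    # "input" and "output" are hypothetical field names for this sketch.
    return {"output": render(event["input"])}
```

Passing the compile step in as `compile_fn` keeps the unpack and encode logic testable on a machine without TeX installed.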

The Dockerfile and AWS Lambda deployment scripts are available here: https://github.com/samoconnor/lambdalatex

I hope it is useful to someone!

Cheers,

Sam O'Connor

OC Technology Pty Ltd
Karl Berry
2018-04-01 23:28:18 UTC
Subject: [tex-live] Announce: TeX Live on AWS Lambda

Congratulations.

- deletes unneeded large files (e.g. doc pdfs) and

Could you just skip installing all doc and source files with the
respective profile options?
tlpdbopt_install_docfiles 0
tlpdbopt_install_srcfiles 0
(I believe.)

Happy computing,
Karl
Sam O'Connor
2018-04-02 03:17:01 UTC
Thanks Karl,

I already have docfiles and srcfiles disabled (and tlpdbopt_create_formats=0)

I trim about 46M by removing these files:

8.1M /var/task/texlive/2017/bin/x86_64-linux/luajittex
7.9M /var/task/texlive/2017/bin/x86_64-linux/luatex
400K /var/task/texlive/2017/tlpkg/texlive.tlpdb
13M /var/task/texlive/2017/tlpkg/texlive.tlpdb.712d2ebb90babb07f11d145f4eea99aa
9.7M /var/task/texlive/2017/texmf-dist/source/latex/koma-script/doc
7.7M /var/task/texlive/2017/texmf-dist/doc
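
As a quick check that the sizes listed above add up to roughly the quoted figure, here is a small Python sketch. It assumes the sizes are in `du -h` style, with K and M as binary multiples; the function names are illustrative only.

```python
# Conversion factors from the size suffix to MiB.
UNITS = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}

LISTING = """\
8.1M /var/task/texlive/2017/bin/x86_64-linux/luajittex
7.9M /var/task/texlive/2017/bin/x86_64-linux/luatex
400K /var/task/texlive/2017/tlpkg/texlive.tlpdb
13M /var/task/texlive/2017/tlpkg/texlive.tlpdb.712d2ebb90babb07f11d145f4eea99aa
9.7M /var/task/texlive/2017/texmf-dist/source/latex/koma-script/doc
7.7M /var/task/texlive/2017/texmf-dist/doc
"""


def size_in_mib(text):
    """Convert a size such as '8.1M' or '400K' to MiB."""
    return float(text[:-1]) * UNITS[text[-1]]


def total_mib(listing):
    """Sum the first column of a du-style listing."""
    return sum(size_in_mib(line.split()[0]) for line in listing.splitlines())


print(f"{total_mib(LISTING):.1f} MiB")  # prints 46.8 MiB
```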

I guess the doc is being installed because of adding packages with tlmgr install after install-tl.

You can see the details in the Dockerfile and profile here:
https://github.com/samoconnor/lambdalatex/blob/master/Dockerfile
https://github.com/samoconnor/lambdalatex/blob/master/texlive.profile

Cheers,

Sam
Norbert Preining
2018-04-03 00:13:35 UTC
Hi Sam,

first of all, nice to see your work on Docker and AWS, very interesting!
Post by Sam O'Connor
8.1M /var/task/texlive/2017/bin/x86_64-linux/luajittex
7.9M /var/task/texlive/2017/bin/x86_64-linux/luatex
If you don't need that, you could also do
tlmgr remove --force luatex
to get rid of all of luatex, not only the binaries.
Post by Sam O'Connor
400K /var/task/texlive/2017/tlpkg/texlive.tlpdb
Deleting this file will freeze everything for tlmgr, which means you will
not be able to change anything after this has been done. This is probably
fine for your application.
Post by Sam O'Connor
13M /var/task/texlive/2017/tlpkg/texlive.tlpdb.712d2ebb90babb07f11d145f4eea99aa
That can be deleted, it is the cached tlpdb from the tlmgr install run.
Post by Sam O'Connor
9.7M /var/task/texlive/2017/texmf-dist/source/latex/koma-script/doc
These files are included in the "runfiles" sections because the author
of koma-script wants it this way. Again, for your case I think it is ok
to remove them.
Post by Sam O'Connor
7.7M /var/task/texlive/2017/texmf-dist/doc
What else remains there?
Post by Sam O'Connor
I guess the doc is being installed because of adding packages with tlmgr install after install-tl.
No, if you have the options set during installation they will be
preserved in the tlpdb and further installs will also honor them.
If this doesn't work, it is a bug. Can you let me know what else remains
in texmf-dist/doc?

Thanks

Norbert

--
PREINING Norbert http://www.preining.info
Accelia Inc. + JAIST + TeX Live + Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
Sam O'Connor
2018-04-03 00:27:48 UTC
Post by Norbert Preining
Post by Sam O'Connor
8.1M /var/task/texlive/2017/bin/x86_64-linux/luajittex
7.9M /var/task/texlive/2017/bin/x86_64-linux/luatex
If you don't need that, you could also do
tlmgr remove --force luatex
to get rid of all of luatex, not only the binaries.
That’s good to know. Thanks!
Post by Norbert Preining
Post by Sam O'Connor
7.7M /var/task/texlive/2017/texmf-dist/doc
What else remains there?
koma related docs...

bash-4.2# find /var/task/texlive/2017/texmf-dist/doc
/var/task/texlive/2017/texmf-dist/doc
/var/task/texlive/2017/texmf-dist/doc/latex
/var/task/texlive/2017/texmf-dist/doc/latex/koma-script
/var/task/texlive/2017/texmf-dist/doc/latex/koma-script/README
/var/task/texlive/2017/texmf-dist/doc/latex/koma-script/scraddr.html
/var/task/texlive/2017/texmf-dist/doc/latex/koma-script/manifest.txt
— snip —


It might be nice to have a `tlmgr` or `texlive.profile` "clean-all" option to create an absolute-minimum install (even if it is locked from further updates, that is fine for many embedded or server deployment scenarios).

Question: is it possible to put a list of specific packages to install in the `texlive.profile` file? i.e. specify a custom collection in-line?

Thanks for your help with this!
Cheers,

Sam
Uwe Ziegenhagen
2018-04-03 00:34:33 UTC
Please be aware that some package authors might object to having the
documentation stripped away from the actual code.

This could be a minefield you don't want to step into...

Uwe
--
Dr. Uwe Ziegenhagen
<http://www.uweziegenhagen.de>
Norbert Preining
2018-04-03 00:38:41 UTC
Hi
Please be aware that some package authors might object to having the
documentation stripped away from the actual code.
Yes, that is what I mentioned. But since it is run in a purely automated
environment without a user being able to ask for documentation, I guess
it is fine in this case.

Norbert

--
PREINING Norbert http://www.preining.info
Accelia Inc. + JAIST + TeX Live + Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
Norbert Preining
2018-04-03 00:37:37 UTC
Hi Sam,
Post by Sam O'Connor
koma related docs...
Yes, I explained that in the previous email, this is on purpose.
Post by Sam O'Connor
It might be nice to have a `tlmgr` or `texlive.profile` "clean-all" option to create an absolute-minimum install (even if it is locked from further updates, that is fine for many embedded or server deployment scenarios).
What is "absolute minimum"? That is impossible to define. Only the
engines and one needs to build formats by hand? Which languages? Which
fonts? Which packages? There is no way to fix this once and for all,
thus we refrain from doing so.
Post by Sam O'Connor
Question: is it possible to put a list of specific packages to install in the `texlive.profile` file? i.e. specify a custom collection in-line?
No, that is not possible at the moment, sorry.

All the best

Norbert

--
PREINING Norbert http://www.preining.info
Accelia Inc. + JAIST + TeX Live + Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
Sam O'Connor
2018-04-03 03:17:04 UTC
Post by Norbert Preining
If you don't need that, you could also do
tlmgr remove --force luatex
to get rid of all of luatex, not only the binaries.
This works if I also use `tlpdbopt_autobackup 0`.
Without that option removing luatex does not decrease the installation size.
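
To collect the size-related options mentioned so far in this thread in one place, here is a small Python sketch that renders them as profile lines. The option names are the ones quoted in the thread; the one "key value" pair per line layout is how I understand `texlive.profile`, and `render_profile` is a hypothetical helper, not part of TeX Live.

```python
# Options named in this thread, all switched off to keep the install small.
SIZE_OPTIONS = {
    "tlpdbopt_install_docfiles": 0,
    "tlpdbopt_install_srcfiles": 0,
    "tlpdbopt_create_formats": 0,
    "tlpdbopt_autobackup": 0,
}


def render_profile(options):
    """Render options as one 'key value' line per entry."""
    return "".join(f"{key} {value}\n" for key, value in options.items())


print(render_profile(SIZE_OPTIONS), end="")
```

A real profile also needs the scheme and directory settings, which are left out here because they depend on the installation.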
Post by Norbert Preining
What is "absolute minimum"? That is impossible to define. Only the
engines and one needs to build formats by hand? Which languages? Which
fonts? Which packages? There is no way to fix this once and for all,
thus we refrain from doing so.
True. The minimum is not generally knowable.
Maybe tlmgr could have a simple clean option that just removes the cached tlpdb.
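
Until something like that exists, the cleanup can be done by hand. A minimal Python sketch, assuming the cached copy is always named `texlive.tlpdb.` followed by 32 hex digits (this is inferred from the single file name shown earlier in the thread; `remove_cached_tlpdb` is a made-up helper, not a tlmgr feature):

```python
import re
from pathlib import Path

# Assumption: cached databases look like texlive.tlpdb.<32 hex digits>.
CACHED_TLPDB = re.compile(r"texlive\.tlpdb\.[0-9a-f]{32}")


def remove_cached_tlpdb(texlive_root):
    """Delete cached tlpdb copies under <root>/tlpkg, keep texlive.tlpdb."""
    tlpkg = Path(texlive_root) / "tlpkg"
    removed = []
    for path in sorted(tlpkg.iterdir()):
        if path.is_file() and CACHED_TLPDB.fullmatch(path.name):
            path.unlink()
            removed.append(path.name)
    return removed
```

For the installation discussed here the call would be `remove_cached_tlpdb("/var/task/texlive/2017")`.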
Post by Norbert Preining
Post by Uwe Ziegenhagen
Please be aware that some package authors might object to having the
documentation stripped away from the actual code.
Yes, that is what I mentioned. But since it is run in a purely automated
environment without a user being able to ask for documentation, I guess
it is fine in this case.
I’m assuming that the packages in question are distributed under the terms of the LPPL.
There is no intention of redistributing any of the packages, so my interpretation is that the only applicable clauses are 1 and 5:

“1. Activities other than distribution and/or modification of the Work are not covered by this license; they are outside its scope. In particular, the act of running the Work is not restricted and no requirements are made concerning any offers of support for the Work.”

“5. If you are not the Current Maintainer of the Work, you may modify your copy of the Work, thus creating a Derived Work based on the Work, and compile this Derived Work, thus creating a Compiled Work based on the Derived Work."

So we could think of a stripped-down installation as a modification, or as a compiled work ("processed into a form where it is directly usable on a computer system”). In either case, modification, compiling, running and offers of support are all unrestricted in the absence of distribution.

Uwe, if you believe that creating a stripped down installation is something that package authors might view as being outside the terms of the licence, I would like to understand why and try to do things in a way that complies with the license.

Regards,

Sam
Uwe Ziegenhagen
2018-04-03 06:35:00 UTC
Hi Sam,

if it is just for you, there surely is no issue. But if you distribute the
files in any way, that might create some trouble. It's a pretty long story,
which I am not really eager to tell.

Uwe
--
Dr. Uwe Ziegenhagen
<http://www.uweziegenhagen.de>
Karl Berry
2018-04-03 22:31:43 UTC
The minimum is not generally knowable.

True, but there is an "infra" scheme which is intended to install the
minimum needed to do anything useful. It's used by the l3build setup,
for instance. It does include luatex, though. -k
Sam O'Connor
2018-04-03 23:39:15 UTC
Post by Sam O'Connor
The minimum is not generally knowable.
There is an "infra" scheme which is intended to install the minimum
needed to do anything. It's used by the l3build setup, for instance.
It does include luatex, though. -k
Thanks Karl,

Another idea I’m considering is to use `strace -f -t -e trace=file` to get a list of files that are accessed during a test run and then delete everything else before creating the deployment archive. In this application the input always comes from the same template so the set of accessed files should be fairly stable.
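
A rough Python sketch of the first half of that idea, turning a saved strace log into the set of files that were opened successfully. The regular expression is an assumption about what the log lines look like (a quoted path as the first argument, possibly after `AT_FDCWD`, and the return value after `) = `); split `<unfinished ...>` lines are simply skipped, and the function name is illustrative.

```python
import re

# Matches e.g.  open("/path", O_RDONLY) = 3
#          or   openat(AT_FDCWD, "/path", O_RDONLY) = -1 ENOENT (...)
CALL = re.compile(r'\w+\((?:AT_FDCWD, )?"([^"]+)".*\) = (-?\d+)')


def accessed_files(strace_lines, root):
    """Return paths under root that appear in successful file syscalls."""
    used = set()
    for line in strace_lines:
        match = CALL.search(line)
        if not match:
            continue
        path, result = match.groups()
        # A negative return value means the call failed (e.g. ENOENT).
        if not result.startswith("-") and path.startswith(root):
            used.add(path)
    return used
```

Everything under the TeX Live root that is not in the returned set would then be a candidate for deletion, ideally after comparing the result across several test runs.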
Norbert Preining
2018-04-04 00:35:11 UTC
Post by Sam O'Connor
Another idea I’m considering is to use `strace -f -t -e trace=file`
Or
pdflatex -recorder ..
and look into the produced .fls file. But it will not list all the
files, only those loaded via kpathsea I guess.
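
For the -recorder route, the .fls file is a plain list of `PWD`, `INPUT` and `OUTPUT` lines, so collecting the inputs is short. A minimal Python sketch; `fls_inputs` is a made-up name, and resolving relative paths against the `PWD` line is my assumption about how to read the file.

```python
import posixpath


def fls_inputs(fls_text):
    """Return the set of INPUT paths from a .fls file, made absolute."""
    pwd = ""
    inputs = set()
    for line in fls_text.splitlines():
        kind, _, path = line.partition(" ")
        if kind == "PWD":
            pwd = path
        elif kind == "INPUT":
            # join() leaves absolute paths alone and anchors relative ones.
            inputs.add(posixpath.normpath(posixpath.join(pwd, path)))
    return inputs
```

As noted above, this only covers what the recorder sees, so the result is best treated as a lower bound on the files to keep.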

Norbert

--
PREINING Norbert http://www.preining.info
Accelia Inc. + JAIST + TeX Live + Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13