1440 Commits

Author SHA1 Message Date
Owen Cliffe
c6abc8bf64 Use context logging more to ensure context vars are present in log lines (#1039) 2018-06-06 15:14:29 +01:00
CI
19bc5cf852 fnserver: 0.3.463 release [skip ci] 2018-06-05 21:48:33 +00:00
Tolga Ceylan
4af53025d8 fn: lb-agent: Initial TryCall result can be retriable. (#1035)
Before this change, we assumed data may end up in a container
once we placed a TryCall() and if gRPC send failed, we did not
retry. However, a send failure cannot result in data in a
container, since only upon successful receipt of a TryCall can
pure-runner schedule a call into a container. Here we trust
gRPC and if gRPC layer says it could not send a msg, then
the receiver did not receive it.
2018-06-05 14:41:13 -07:00
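The retry rule described in 4af53025d8 can be sketched roughly as follows. This is a minimal illustration, not the lb-agent code: the `runner` function type, `place`, and the error values are hypothetical names, and the `sent` flag stands in for whatever the gRPC layer reports about delivery.

```go
package main

import (
	"errors"
	"fmt"
)

// runner is a hypothetical stand-in for one TryCall attempt against a runner.
// sent reports whether the transport delivered the message at all.
type runner func() (sent bool, err error)

// errNoRunner is an illustrative error, not an Fn error type.
var errNoRunner = errors.New("no runner accepted the call")

// place tries runners in order. A failure is retried on the next runner only
// when the message was never sent, because in that case no data can have
// reached a container.
func place(runners []runner) (attempts int, err error) {
	for _, try := range runners {
		attempts++
		sent, tryErr := try()
		if tryErr == nil {
			return attempts, nil
		}
		if sent {
			// The runner received the call, so retrying is not safe.
			return attempts, tryErr
		}
		// Send failed: the receiver did not get the message, try the next runner.
	}
	return attempts, errNoRunner
}

func sendFails() (bool, error)      { return false, errors.New("send failed") }
func failsAfterSend() (bool, error) { return true, errors.New("runner failed mid-call") }
func succeeds() (bool, error)       { return true, nil }

func main() {
	n, err := place([]runner{sendFails, succeeds})
	fmt.Println(n, err) // the second runner is tried after the send failure

	n, err = place([]runner{failsAfterSend, succeeds})
	fmt.Println(n, err) // stops at the first runner
}
```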
CI
64431b4497 fnserver: 0.3.462 release [skip ci] 2018-06-05 21:31:15 +00:00
Andrea Rosa
c2c295ffb3 Add a LBAgent constructor which accept AgentConfig (#1037)
In some cases it could be useful to pass an agent configuration to the
LBAgent constructor; this small change adds a new constructor which
accepts an agent configuration as an additional parameter.
2018-06-05 13:59:43 -07:00
Tolga Ceylan
1cd5894f41 fn: LB agent: reduce 'Too Busy' error logs (#1033)
With this PR, runner client translates too busy errors
from gRPC session and runner itself into Fn error type.
Placers now ignore this error message to reduce unnecessary
logging.
2018-06-04 12:16:00 -07:00
Tolga Ceylan
7261ddedcc fn: LB agent: EOF from runner is normal in nack cases (#1032) 2018-06-04 12:10:00 -07:00
Reed Allman
00c29b8bf3 datastore no longer implements logstore (#1013)
* datastore no longer implements logstore

the underlying implementation of our sql store implements both the datastore
and the logstore interface, however going forward we are likely to encounter
datastore implementers that would mock out the logstore interface and not use
its methods - signalling a poor interface. this remedies that, now they are 2
completely separate things, which our sqlstore happens to implement both of.

related to some recent changes around wrapping, this keeps the imposed metrics
and validation wrapping of a servers logstore and datastore, just moving it
into New instead of in the opts - this is so that a user can have the
underlying datastore in order to set the logstore to it, since wrapping it in
a validator/metrics would render it no longer a logstore implementer (i.e.
validate datastore doesn't implement the logstore interface), we need to do
this after setting the logstore to the datastore if one wasn't provided
explicitly.

* splits logstore and datastore metrics & validation logic
* `make test` should be `make full-test` always. got rid of that so that
nobody else has to wait for CI to blow up on them after the tests pass locally
ever again.

* fix new tests
2018-06-04 00:08:16 -07:00
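The ordering problem described in 00c29b8bf3 (a wrapped datastore no longer satisfies the logstore interface, so the fallback has to happen before wrapping) might look like this in miniature. All types here are heavily reduced, hypothetical stand-ins; the real Fn interfaces have many more methods.

```go
package main

import (
	"errors"
	"fmt"
)

// Datastore and LogStore are hypothetical one-method versions of the two interfaces.
type Datastore interface {
	GetApp(name string) (string, error)
}

type LogStore interface {
	GetLog(callID string) (string, error)
}

// sqlStore happens to implement both interfaces.
type sqlStore struct{}

func (sqlStore) GetApp(name string) (string, error)   { return "app:" + name, nil }
func (sqlStore) GetLog(callID string) (string, error) { return "log:" + callID, nil }

// validator wraps only a Datastore, so the wrapped value is not a LogStore.
type validator struct{ Datastore }

func (v validator) GetApp(name string) (string, error) {
	if name == "" {
		return "", errors.New("missing app name")
	}
	return v.Datastore.GetApp(name)
}

type Server struct {
	ds Datastore
	ls LogStore
}

// NewServer falls back to the raw datastore as the logstore first,
// and only then applies the validation wrapper.
func NewServer(ds Datastore, ls LogStore) *Server {
	if ls == nil {
		if l, ok := ds.(LogStore); ok {
			ls = l
		}
	}
	return &Server{ds: validator{ds}, ls: ls}
}

// wrappedIsLogStore reports whether the wrapped datastore still satisfies LogStore.
func wrappedIsLogStore(s *Server) bool {
	_, ok := s.ds.(LogStore)
	return ok
}

func main() {
	s := NewServer(sqlStore{}, nil)
	fmt.Println(s.ls != nil, wrappedIsLogStore(s))
}
```

Doing the type assertion after wrapping would leave the server without a logstore, which is why the sketch performs it inside the constructor rather than in an option.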
Tomas Knappek
e0425abd19 unmatched api handler reported as 'invalid' (#1028) 2018-06-01 13:49:40 -07:00
Tolga Ceylan
a57907eed0 fn: user friendly timeout handling changes (#1021)
* fn: user friendly timeout handling changes

Timeout setting in routes now means "maximum amount
of time a function can run in a container".

Total wait time for a given http request is now expected
to be handled by the client. As long as the client waits,
the LB, runner or agents will search for resources to
schedule it.
2018-06-01 13:18:13 -07:00
CI
ffefcf5773 fnserver: 0.3.461 release [skip ci] 2018-06-01 10:49:07 -07:00
Tolga Ceylan
f97b63f878 fn: fixup temp dir read/write permissions if tmp fs size is not set. (#1024)
When TmpFsSize is not set in a route, docker fails to create a /tmp
mount that is writable. Force docker to do this explicitly if
read-only root directory is enabled (default).
2018-06-01 10:49:07 -07:00
CI
316940285d fnserver: 0.3.460 release [skip ci] 2018-05-31 20:11:30 +00:00
Tolga Ceylan
e1b7e30e49 fn: cleanup of unused/global constants in lb agent (#1020)
Moved retry interval to a placer member variable for the time being.
2018-05-31 13:04:06 -07:00
CI
a282005bf7 fnserver: 0.3.459 release [skip ci] 2018-05-31 01:24:47 +00:00
Tolga Ceylan
d190167580 fn: read-only root fs becomes default (#1019)
* fn: read-only root fs becomes default

Set root fs as read-only by default.

* fn: update doc for FN_DISABLE_READONLY_ROOTFS
2018-05-30 18:17:28 -07:00
CI
f4712b4f5b fnserver: 0.3.458 release [skip ci] 2018-05-29 23:33:43 +00:00
Tolga Ceylan
7f1d14d21f fn: slot hash id must be utf8 in gRPC (#1016) 2018-05-29 16:26:43 -07:00
CI
47c6fb54af fnserver: 0.3.457 release [skip ci] 2018-05-25 21:20:22 +00:00
Tolga Ceylan
74a5379dec fn: lb & pure-runner slot hash id communication (#1007)
* fn: lb & pure-runner slot hash id communication

With this change, LB can pre-calculate the slot hash
key and pass it to runners. If LB knows/calculates
the slot hash ids, then it can also make better
estimates on which runner can successfully execute
it especially when status messages from runner
include a small summary of idle slots for a given
slot hash id. (TODO)

* fn: fix mock test
2018-05-25 14:12:48 -07:00
Tolga Ceylan
9584643142 fn: size restricted tmpfs /tmp and read-only / support (#1012)
* fn: size restricted tmpfs /tmp and read-only / support

*) read-only Root Fs Support
*) removed CPUShares from docker API. This was unused.
*) docker.Prepare() refactoring
*) added docker.configureTmpFs() for size limited tmpfs on /tmp
*) tmpfs size support in routes and resource tracker
*) fix fn-test-utils to handle sparse files better in create file

* test typo fix
2018-05-25 14:12:29 -07:00
CI
71dbf9fa57 fnserver: 0.3.456 release [skip ci] 2018-05-25 15:39:27 +00:00
Tomas Knappek
e3e264de53 Api metrics (#1014)
* api metrics support

* comments reflected

* metrics middleware fix
2018-05-25 08:31:37 -07:00
CI
05caef0d26 fnserver: 0.3.455 release [skip ci] 2018-05-21 17:50:03 +00:00
Tomas Knappek
8aae1502f0 Admin server for paths which are not part of API (#1011)
* admin server added

* test fixed, ping moved out of admin server

* keeping admin/web port in sync
2018-05-21 10:41:27 -07:00
CI
4623bf6a59 fnserver: 0.3.454 release [skip ci] 2018-05-21 10:35:43 +00:00
Gerardo Viedma
ea1f94253f Implement graceful shutdown of agent.DataAccess (#1008)
* Implements graceful shutdown of agent.DataAccess and underlying Datastore/Logstore/MessageQueue

* adds tests for closing agent.DataAccess and Datastore
2018-05-21 11:28:21 +01:00
CI
105d9b8f1d fnserver: 0.3.453 release [skip ci] 2018-05-19 00:07:07 +00:00
CI
60a642092e fnserver: 0.3.452 release [skip ci] 2018-05-18 21:48:39 +00:00
CI
3198a14410 fnserver: 0.3.451 release [skip ci] 2018-05-17 23:30:35 +00:00
CI
8518bea0e7 fnserver: 0.3.450 release [skip ci] 2018-05-17 22:20:02 +00:00
Tolga Ceylan
77086ecc24 fn: lb-agent & runner gRPC updates (#1005)
Breaking changes:

*) Removed unused ACK/NACK definitions
*) Extended Finished messages with error code/str
2018-05-17 15:02:15 -07:00
Tolga Ceylan
7cf8e2a61d fn: pure-runner time out while waiting TryCall (#1006)
This should return a retriable error code 503.
2018-05-17 15:00:50 -07:00
CI
8c2c6286dc fnserver: 0.3.449 release [skip ci] 2018-05-17 20:30:29 +00:00
Tomas Knappek
ccde0d2357 Wrap custom datastore with metrics and validator (#1002)
* Wrap method added to datastore

* datastore formatting fixed
2018-05-17 13:21:36 -07:00
CI
3508378b8f fnserver: 0.3.448 release [skip ci] 2018-05-17 19:17:28 +00:00
Tolga Ceylan
4ccde8897e fn: lb and pure-runner with non-blocking agent (#989)
* fn: lb and pure-runner with non-blocking agent

*) Removed pure-runner capacity tracking code. This did
not play well with internal agent resource tracker.
*) In LB and runner gRPC comm, removed ACK. Now,
upon TryCall, pure-runner quickly proceeds to call
Submit. This is good since at this stage pure-runner
already has all relevant data to initiate the call.
*) Unless pure-runner emits a NACK, LB immediately
streams http body to runners.
*) For retriable requests added a CachedReader for
http.Request Body.
*) Idempotency/retry is similar to previous code.
After initial success in Engagement, after attempting
a TryCall, unless we receive NACK, we cannot retry
that call.
*) ch and naive placers now wrap each TryExec with
a cancellable context to clean up gRPC contexts
quicker.

* fn: err for simpler one-time read GetBody approach

This allows for a more flexible approach since we let
users define GetBody() to allow repeated http body
reads. In default LB case, LB executes a one-time io.ReadAll
and sets GetBody, which is detected by RunnerCall.RequestBody().

* fn: additional check for non-nil req.body

* fn: attempt to override IO errors with ctx for TryExec

* fn: system-tests log dest

* fn: LB: EOF send handling

* fn: logging for partial IO

* fn: use buffer pool for IO storage in lb agent

* fn: pure runner should use chunks for data msgs

* fn: required config validations and pass APIErrors

* fn: additional tests and gRPC proto simplification

*) remove ACK/NACK messages as Finish message type works
OK for this purpose.
*) return resp in api tests to check for status code
*) empty body json test in api tests for lb & pure-runner

* fn: buffer adjustments

*) setRequestBody result handling correction
*) switch to bytes.Reader for read-only safety
*) io.EOF can be returned for non-nil Body in request.

* fn: clarify detection of 503 / Server Too Busy
2018-05-17 12:09:03 -07:00
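The one-time read plus GetBody approach mentioned in 4ccde8897e can be sketched with the standard library alone. This is an assumption-laden illustration: `bufferBody` and `readTwice` are hypothetical helpers, the URL is a placeholder, and the actual lb-agent code (including its buffer pool and RunnerCall.RequestBody()) is not reproduced here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// bufferBody reads req.Body once and installs GetBody so that each retry
// can obtain a fresh, read-only view over the same bytes.
func bufferBody(req *http.Request) error {
	if req.Body == nil || req.Body == http.NoBody {
		return nil
	}
	data, err := io.ReadAll(req.Body)
	req.Body.Close()
	if err != nil {
		return err
	}
	req.GetBody = func() (io.ReadCloser, error) {
		// bytes.Reader never mutates data, so concurrent retries stay safe.
		return io.NopCloser(bytes.NewReader(data)), nil
	}
	req.Body, _ = req.GetBody()
	return nil
}

// readTwice simulates two delivery attempts of the same request body.
func readTwice(payload string) (first, second string, err error) {
	body := io.NopCloser(strings.NewReader(payload))
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/", body)
	if err != nil {
		return "", "", err
	}
	if err := bufferBody(req); err != nil {
		return "", "", err
	}
	var out [2]string
	for i := range out {
		rc, err := req.GetBody()
		if err != nil {
			return "", "", err
		}
		b, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return "", "", err
		}
		out[i] = string(b)
	}
	return out[0], out[1], nil
}

func main() {
	a, b, err := readTwice(`{"name":"fn"}`)
	fmt.Println(a, b, err)
}
```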
CI
1083623045 fnserver: 0.3.447 release [skip ci] 2018-05-17 12:15:30 +00:00
Mark Godfrey
ac4e6c5a03 Replace panic on enqueue for LB agent with error. (#1004) 2018-05-17 13:05:47 +01:00
CI
0c022558a8 fnserver: 0.3.446 release [skip ci] 2018-05-16 21:35:13 +00:00
Tomas Knappek
f6d47fd0ed add DELETE to allowed cors methods (#1001) 2018-05-16 14:25:02 -07:00
CI
c7aaf732fe fnserver: 0.3.445 release [skip ci] 2018-05-16 18:55:35 +00:00
Tolga Ceylan
eab85dfab0 fn: agent MaxRequestSize limit (#998)
* fn: agent MaxRequestSize limit

Currently, LimitRequestBody() exists to install an
http request body size limit in http/gin server. For production
environments, this is expected to be used. However, in agents
we may need to verify/enforce these size limits and to be
able to assert in case of missing limits is valuable.
With this change, operators can define an agent env variable
to limit this in addition to installing Gin/Http handler.

http.MaxBytesReader is superior in some cases as it sets
http headers (Connection: close) to guard against subsequent
requests.

However, NewClampReadCloser() is superior in other cases,
where it can cleanly return an API error for this case alone
(http.MaxBytesReader() does not return a clean error type
for overflow case, which makes it difficult to use it without
peeking into its implementation.)

For lb agent, upcoming changes rely on such limits enabled
and using gin/http handler (http.MaxBytesReader) makes such
checks/safety validations difficult.

* fn: read/write clamp code adjustment

In case of overflows, opt for simple implementation
of a partial write followed by return error.
2018-05-16 11:45:57 -07:00
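To illustrate the idea in eab85dfab0 of a clamping reader that yields a clean, checkable error on overflow, here is a minimal sketch. It is not the NewClampReadCloser() implementation: `clampReader`, `ErrTooLarge`, and `readAllClamped` are hypothetical names, and the real code returns an Fn API error rather than a plain sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// ErrTooLarge is a hypothetical sentinel; callers can compare against it directly.
var ErrTooLarge = errors.New("request body too large")

// clampReader lets at most `remaining` bytes through and then reports
// ErrTooLarge if the underlying reader still has data.
type clampReader struct {
	r         io.Reader
	remaining int64
}

func (c *clampReader) Read(p []byte) (int, error) {
	if c.remaining <= 0 {
		// Probe one byte to tell "exactly at the limit" apart from overflow.
		var probe [1]byte
		n, err := c.r.Read(probe[:])
		if n > 0 {
			return 0, ErrTooLarge
		}
		return 0, err
	}
	if int64(len(p)) > c.remaining {
		p = p[:c.remaining]
	}
	n, err := c.r.Read(p)
	c.remaining -= int64(n)
	return n, err
}

// readAllClamped returns whatever was read before the limit was hit,
// together with the error, i.e. a partial read followed by an error.
func readAllClamped(body string, limit int64) (string, error) {
	data, err := io.ReadAll(&clampReader{r: strings.NewReader(body), remaining: limit})
	return string(data), err
}

func main() {
	got, err := readAllClamped("hello world", 5)
	fmt.Printf("%q %v\n", got, err)

	got, err = readAllClamped("hello", 5)
	fmt.Printf("%q %v\n", got, err)
}
```

Unlike http.MaxBytesReader, this sketch does not touch response headers; it only shows how a dedicated error value makes the overflow case easy to detect.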
CI
7a64ec9db5 fnserver: 0.3.444 release [skip ci] 2018-05-15 18:10:16 +00:00
Reed Allman
cbe0d5e9ac add user syslog writers to app (#970)
* add user syslog writers to app

users may specify a syslog url[s] on apps now and all functions under that app
will spew their logs out to it. the docs have more information around details
there, please review those (swagger and operating/logging.md), tried to
implement to spec in some parts and improve others, open to feedback on
format though, lots of liberty there.

design decision wise, I am looking to the future and ignoring cold containers.
the overhead of the connections there will not be worth it, so this feature
only works for hot functions, since we're killing cold anyway (even if a user
can just straight up exit a hot container).

syslog connections will be opened against a container when it starts up, and
then the call id that is logged gets swapped out for each call that goes
through the container, this cuts down on the cost of opening/closing
connections significantly. there are buffers to accumulate logs until we get a
`\n` to actually write a syslog line, and a buffer to save some bytes when
we're writing the syslog formatting as well. underneath writers re-use the
line writer in certain scenarios (swapper). we could likely improve the ease
of setting this up, but opening the syslog conns against a container seems
worth it, and is a different path than the other func loggers that we create
when we make a call object. the Close() stuff is a little tricky, not sure how
to make it easier and have the ^ benefits, open to idears.

this does add another vector of 'limits' to consider for more strict service
operators. one being how many syslog urls can a user add to an app (infinite,
atm) and the other being on the order of number of containers per host we
could run out of connections in certain scenarios. there may be some utility
in having multiple syslog sinks to send to, it could help with debugging at
times to send to another destination or if a user is a client w/ someone and
both want the function logs, e.g. (have used this for that in the past,
specifically).

this also doesn't work behind a proxy, which is something i'm open to fixing,
but afaict will require a 3rd party dependency (we can pretty much steal what
docker does). this is mostly of utility for those of us that work behind a
proxy all the time, not really for end users.

there are some unit tests. integration tests for this don't sound very fun to
maintain. I did test against papertrail with each protocol and it works (and
even times out if you're behind a proxy!).

closes #337

* add trace to syslog dial
2018-05-15 11:00:26 -07:00
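The buffering behaviour described in cbe0d5e9ac (accumulate output until a `\n`, then emit one line tagged with the current call id, which is swapped between calls) could be sketched as below. The `lineWriter` type, its `SwapCallID` method, and the `call_id=` prefix are all hypothetical; the output is not real syslog framing, and no network connection is opened.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// lineWriter buffers partial writes and forwards only complete lines to sink,
// prefixing each with the call id that is current when the line completes.
type lineWriter struct {
	buf    bytes.Buffer
	callID string
	sink   io.Writer
}

func (w *lineWriter) Write(p []byte) (int, error) {
	w.buf.Write(p)
	for {
		i := bytes.IndexByte(w.buf.Bytes(), '\n')
		if i < 0 {
			// No complete line yet; keep the remainder buffered.
			return len(p), nil
		}
		line := w.buf.Next(i + 1) // includes the trailing newline
		if _, err := fmt.Fprintf(w.sink, "call_id=%s %s", w.callID, line); err != nil {
			return len(p), err
		}
	}
}

// SwapCallID changes the id used for subsequent lines; the sink stays open.
func (w *lineWriter) SwapCallID(id string) { w.callID = id }

// demo writes fragments across two simulated calls and returns what reached the sink.
func demo() string {
	var out bytes.Buffer
	w := &lineWriter{sink: &out, callID: "A"}
	io.WriteString(w, "hel")
	io.WriteString(w, "lo\nwor")
	w.SwapCallID("B")
	io.WriteString(w, "ld\n")
	return out.String()
}

func main() {
	fmt.Print(demo())
}
```

Note that in this sketch a line that straddles a swap is tagged with the newer call id; whether the real writers flush on swap is not stated in the commit message.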
CI
e269f965a4 fnserver: 0.3.443 release [skip ci] 2018-05-15 17:13:40 +00:00
Tomas Knappek
19f09b3a6c Added FN_API_CORS_HEADERS for configuring CORS headers (#997) 2018-05-15 18:03:01 +01:00
CI
37354838b4 fnserver: 0.3.442 release [skip ci] 2018-05-11 21:44:50 +00:00
Tolga Ceylan
cba3fc14e7 fn: rename vars for clarity (#992) 2018-05-11 14:35:13 -07:00
CI
1c2740819b fnserver: 0.3.441 release [skip ci] 2018-05-10 15:17:45 +00:00