האזינו: מדוע Taboola עברו מ- Oracle Java ל- Zing
לאחרונה צפיתי בוובינר מעניין של חברת Azul, אותה אנו מייצגים, בשיתוף עם VP IT מחברת Taboola .
הוובינר יעניין מי שעוסק בסביבות production שמריצות Java ו- JVM .
בוובינר שוחחו Simon Riter – סגן ה CTO מחברת AZUL , ואריאל פיזצקי מחברת Taboola.
אריאל סיפר מדוע הם משתמשים ב- Zing לשיפור ביצועי Java / JVM (כולל שקפים של ביצועים לפני ואחרי); המליץ על המוצר והסביר מדוע עברו מתמיכת Java של Oracle .
כמה נקודות מעניינות שהוזכרו במהלך השיחה:
- Taboola עוסקת בהרבה big data:
- 3.2 מליארד דפי אינטרנט ביום
- 1.4 מליארד משתמשים (שונים) בחודש
- מספקים 30 מליארד המלצות יומיות
- מחזיקים 8500 שרתים ברחבי העולם
- 30 מליארד שורות לוגים מדי יום
- דקה 15: ביצועי CPU לפני שימוש ב ZING ולאחריו
(תמונה בהמשך)
רואים ש- Zing שיפר משמעותית את הביצועים - דקה 25: קל לעשות מיגרציה לזה
- אריאל ציין שהם עובדים עם Zing כבר 3 שנים
- לפני Zing השתמשו ב- G1 (סוג של Garbage Collector) . זה היה לפני 4 שנים
- בסוף הוובינר ציין שיש משהו שהם יכולים לעשות רק עם Zing
- הם משתמשים ב- Grafana, Prometheus, Cassandra, Kafka, pull data with local agents
לנוחיותכם הוספנו את התמליל המלא.
לצפייה מדוע עברו ל- Zing :
אנו חברת ALM-Toolbox מייצגים את Azul בישראל ובעולם ומספקים מספקים יעוץ JAVA , סיוע במעבר ורשיונות למוצרי החברה ZING ו ZULU (תמיכת JAVA), הטמעה, הדרכה ומכירת רשיונות Enterprise.
לפרטים נוספים אפשר לפנות אלינו: azul@almtoolbox.com או טלפונית 072-240-5222
לפרטים נוספים אפשר לפנות אלינו: azul@almtoolbox.com או טלפונית 072-240-5222
Transcription:
Good morning, good afternoon or good evening depending on where you are and welcome to this webinar.
Today we're going to be talking about why Taboola switched from Oracle to Azul Java
…
So my name is Simon Ritter i work as the Deputy CTO of Azul systems
and I'm joined today by Ariel Pisetzky who is the VP IT at Taboola
I'll let Ariel introduce himself and start talking about Taboola
[Ariel] Thank you very much Simon
I'm Ariel i'm with Taboola and I would like to start with talking a bit about Taboola
and then talk a lot about what we do with with Azul
specifically with Zing and why it was
why it was helpful for us
so just a bit about Taboola: Taboola is a content discovery platform
we are that company that helps you find content that you may like and you never knew is
out there
when you browse or surf the web and you
are at the bottom of an article or maybe
midsection maybe on the right rail and
you see those
boxes of content that you may like
that is us and we really help people
discover
interesting and new things at the moment
when they are
engaged what we like to call those
moments of
moments of next when you are leaning in
when you are looking at your screen you
are consuming content and you wish to
read the next
article the next interesting thing the
next thing that is relevant
for you and I've said for you
twice now and the main driver for that
is really
the personalization every recommendation
that we provide
is personalized that means that if Simon
and i were to browse the same page
of the same publisher we would not
actually be seeing the same
recommendations the recommendations
for Simon the recommendations for me
would be different
now we do this on multiple digital
properties
places that you've seen online places
that you have seen
and to the extent that we now provide
anywhere between
two to three to four billion web pages a
day depending on the day depending on
the news
um obviously in the time of covid for
example we
saw a huge surge over the first month
during march and a bit into april
where every every one of us was online
consuming content
looking for more news trying to
understand what
is happening around us how the world has
changed from then that number has now
declined back to
normal I'd say browsing kind of
activities
but still it is really interesting to
see how those numbers
ebb and flow and how we provide that
moment of next to every one of you
generally speaking about numbers we see
in our discovery
1.4 billion unique users every month we
see
about 1.5 where we actually serve 1.5
billion clicks which means that
beyond those billions of recommendations
that we provide
every day people also of course click
on those recommendations and we need to
now serve
that recommendation and provide all the
back end
of that click which as a atom
kind of operation would seem very benign
but at
scale is actually quite a big challenge
um in terms of kind of compute
you would see that on average 3.2
billion
a day 30 billion recommendations
8500 physical servers
worldwide we reach 1.4
million queries per second on our system
at
uh aggregate fully aggregated load so if
we look at the
amount of just requests coming in at
peak
that is a interesting number and of
course the lines of logs
for every recommendation we will have a
line of log for every server we will
have metrics we will have
a lot of data that we need to pull in
and we will have
a lot of different events that have
different significance
for different people within the
organization
and that's big data isn't it
that is that is big data in terms of um
variety in terms of uh volume
and in terms of um i'd say the velocity
of it coming in
and if you are kind of looking at what
we did and this is the immediate first
graph
to immediately understand how zing
impacted us
so maybe before i talk about this graph
for a second i'll just say that we are a
java shop
uh our application is java based our
Cassandra database
is java based we have other applications
that are
java based and we have multiple
different companies that
have merged into tabula with the years
that are also java based
so we see a whole lot of impact
every time we can optimize anything
on the specific java platform
and here this is like a plain vanilla
Cassandra
this is a community version Cassandra
real graph
real data from taboola from our clusters
and this is like the first view where
you can see that point in time
where we had a
enough of zing installed on our
cassandra servers
where we moved from a higher level of
the latency
to a lower level of latency in all
of the measurements not only in the p50
but as you can see here in the red line
the p99
so it's really amazing to see how we
could
improve the
runtime of the application by actually
touching the Cassandra
so the Cassandra now on really hundreds
of servers
is not served with the plain old jvm
but with the azul zing jvm saving us
on latency and allowing us to to better
serve our clients
and uh a wonderful thing yeah so as i
i i won't stop you so i'll jump in after
you finish that bit
sorry okay a wonderful thing that you
can see here is the flattening of the
averages and even of the p95 i mean
the p99 is still very noisy the red line
here you can see is
still very noisy but the blue line the
p95
you can see is like totally flat so 95
of the requests coming into this
cassandra database
have actually been totally flattened and
this of course
is due to the job of the i'm sorry the
garbage collection
being handled properly via the azul
software and
yes simon you were you were what you
wanted to say yeah so
i was just going to add a little bit of
detail there because clearly um as you
say
what's happening there is that the
garbage collection which was interfering
with what was happening from the
cassandra cluster
and actually being able to deliver the
results that you're looking for what we
do with zing
is to essentially eliminate the garbage
collection pauses by doing it
concurrently with the application
threads
and that that's a really big difference
because we're running the garbage
collection
simultaneously with the application so
you're able to handle
the the queries you're able to return
the results
whilst we're doing the garbage
collection in the background and
the way that we do that is by using a
read barrier
so that in terms of um every time you
access an object
what we can do is we can ensure that you
can do that safely so if we're doing
marking
we always make sure we mark the object
before we give it to you if we're
actually moving objects around within
the heap which we actually
we do again concurrently with the
application
we do it totally safely so you can make
any changes to those objects and then
when you we give them to you to use
then you can make any changes completely
safely so that's really the big thing
there it's all 100
safe and it gives you that uh
elimination of the latency in the way
that you're seeing in that particular
graph
so i would even use another maybe very
obvious to you simon
but for us when i first heard of azul
was not obvious
drop replacement it was as easy as that
and
we just we didn't have to think about
this we didn't have to do anything in
our code we didn't have to do anything
in our application
and this is cassandra we didn't
obviously touch cassandra we just
replaced the jvm
and it worked yeah that is a nice point
actually the fact that
you can just drop in the jvm you don't
have to change any of your codes there's
no recompilation no recoding to make
take advantage of these features and
even to the point where you don't have
to change any of your startup scripts
you don't have to change any of the
parameters that you set
um we make it really really easy from
the the tuning point of view because
if you wanted to go in and actually do
any tuning essentially what you start
with is just changing the size of the
heap and that's that's really all you
have to do so all those
command line flags that you would
typically use with other jvms
those are not required for zulu of photo
for zinc
and so you can you can use the same
startup scripts because any of the ones
that we don't
support we just ignore so it doesn't
cause any problems
yes and i'll go into the next graph and
this
is the metrics as seen from a single
cassandra node
it's the same kind of graph that you saw
you saw earlier but this time
if the former graph this is from the
kind of application view and that's why
the whole
top is smudged there because it's an
internal application if we go to this
um internal graph you see that this is
the cassandra metrics
and i just smudged the the name of the
data set but you see here that it
totally flattened out
so if we had up to 1.5 seconds
when we're talking about milliseconds
the need for milliseconds of operations
on the uh 99 percentile we now see a
total kind of flattening of that line
and this is on the cassandra side as
well so it's the application sees
a healthier status and the cassandra
itself sees a healthier status
and this wonderful graph or multiple
graphs
this is from the zing uh proof of
concept that we ran
in uh taboola when we started out with
the
um with with the zulu this is again a
few years old
we're going to get newer graphs in a
moment but uh i wanted to bring you
to the kind of beginning of where we
where we started and you see the
the nice trend from left to right you
see that
drop line from left to right to right so
if i
specifically um direct you to graph
number four
on the bottom left side you see on graph
number four
that you have the green line that is
going from
top left to bottom right and that is
read
timeouts so the cluster
actually had about
15 to 13 percent
timeouts for the application wasn't
answering fast enough
within the time budget that my
application
needed and this again is an application
view this is not a cassandra metric this
is an application metric
and you can see how it totally flattened
out
on the uh right bottom right side of
graph number four
and the interesting thing here is why
why do we have this trend over time
because over these few days in december
what we did is we took and
upgraded one two three or five
nodes per day because again this was
just
knowing what we know today we would just
drop it on all of them at once and
that's it
but this was our proof of concept and we
we kind of upgraded our cluster over
time
so the more nodes received the zing
package
the better the timeouts were for that
node and then in average over the full
cluster
if this is a 60 server cluster and think
of the fact that i have
60 servers in each cluster times 6 for
all of my data centers globally
then this is a whole lot of servers
globally that i have to
upgrade and you can just see that
wonderful trend and then eventually
around the end of december
you see this flat line and think of this
as december it's the
peak time for advertisers for publishers
and with all that going on uh up into
up into christmas the vizing proof of
concept was just
proving itself to be so amazing you can
see
on the other graphs the same type of
improvement
on the top you can see the cassandra
metrics on graph number two
and on graph number three you can see
also the request latency
go down there on the nine nine nine yes
simon i'm sorry i was
no no no i was i was just gonna say i
think that's interesting easy because
i you took the approach that i think
everybody would take which is to go okay
let's you know just do it gradually
because we don't want to like put all
our eggs in one basket and suddenly
change everything and find that things
don't work
quite the way that we want them to but
i'm assuming that you didn't find any
problems
in terms of functionality changing so
everything just ran exactly the same as
it did
you when you used the old jvm switch to
zing and everything's just doing
exactly what it was doing before it just
now does it faster and doesn't
have the the read timeouts and things
which is exactly what you're looking for
and yeah as i you could clearly see that
the right approach was to do it
gradually but it's nice to see that
that sort of gradual approach and then
suddenly boom you've got everything
running there and
it just flat lines at the bottom
absolutely
and it flatlines in a good way yes
these these are of course much these are
from two weeks ago
uh we had another another application
internal application that was yet to run
on zing and the r d
team there was going why like why why
not us and i said
guys you want it you can have it uh test
it see what it does for you
so this is already zing um on the on the
version
on the on the 20s i mean this year's
version
and this is as i said two weeks two
weeks old graph
and you can see here that this is the
internal application
um from a different uh proof of concept
this time and you can see
just the cpu consumption so you're
looking at
the same application two identical
servers in terms of hardware and you see
the yellow line
is or the yellow yellowed out line is
the um
is the zing server and the green
area is the nosing server and you see
the
really big differences in cpu only with
this difference
just looking at this before we go into
the rest of the graphs you can see that
i can save
on the amounts of servers because my
server count
is and or actually my let's call it it
footprint
has a huge impact on my hosting costs
my running costs and my ability to serve
recommendations
again let's go into those really big big
numbers when you provide
three billion web pages a day you need a
whole lot of servers
the less servers you have the better the
economics work
yes i was going to say i think this is
also a very interesting graph
from the point of view of showing that
because we do
garbage collection concurrently with
application threads
some people think that the problem might
be well okay now you're actually placing
a heavy load on the system
and so you're going to degrade the
amount of throughput that you get with
the application because now you're doing
garbage collection work at the same time
and the way we get around that is
because we've actually changed the jit
compiler as well
um so we use a jit compiler called
falcon rather than c2 that you get in
the standard open jdk software and that
enables us to sort of
um compensate for the fact that we're
doing garbage collection simultaneously
and still get the performance so that as
you can see here
you're you're delivering lower cpu
utilization
with that low latency as well so it's
kind of a win-win yes
and the next graph is the timeouts so
there weren't a lot of timeouts you see
this is 0.1 percent
but only one server out of the two has
timeouts
uh you see only one spike very small
spike here and a very small spike here
but only one server has these timeouts
and that's the not zing server so we
kept the same color scheme
so if you look this is this is the same
time frame so if we're looking between
2200 and midnight and again between 2200
and midnight we see
the the kind of peaks of timeouts
and we see that the lower the cpu was
the less timeouts
we saw on the server but the zinc server
had no timeouts at all looking at the
next graph
i'm sorry i just just made one comment
on that which is that where
you did see one one spike there and
sometimes there are things that we can't
account for which is that the underlying
operating system
there might be some scheduling things or
you know timeouts that happen
um around the hardware or something like
that so
although i'm not claiming that that's
exactly what it is but often there are
things that we can't compensate for so
we can actually get a flat line
but sometimes we see uh artifacts of the
fact that
by eliminating garbage collection pauses
what you're now seeing is
other underlying things so just just
absolutely and if you look at the at the
space
under the graph the total amount of
requests that timed out
one spike here and a tiny spike there
compared to constant timeouts isn't even
comparable
so i wouldn't i personally we didn't
worry about that
and then this um here is a bit of a
different coloring scheme because this
is a view from the load balancer so i'm
sorry we changed the the coloring scheme
so it maybe is a bit
harder to follow but you see the purple
line
is the zing server and you see again the
two
short kind of spikes of 500s
but if you look at the non-zinc server
you see it's it's showing 500s so if you
take
again these two yes as you can see when
i saw that graph i didn't even realize
there was a second line on there
yes it's it's almost it's totally flat
lined except
these two little spikes here so the
the the little two spikes that that
approximately 3
a.m and between uh 2200 and 2300
and if you're looking at the the red
line again you see that we're constantly
reaching over two percent
of upstream errors which means that two
percent here and then the timeouts
that were in the form of graph all that
put together
shows me that this server which is
identical hardware
identical amount of load is getting me
less results it's manufacturing
less recommendations and that is very
critical for me
looking at busy threads this is another
interesting view
and again corresponds same colors as
before same timeline
corresponds to the zing and not zing
and you see the busy threads spike up
on the non non-zing server
a nice graph here i'm sorry do you want
me to no sorry
i'm just thinking that that that kind of
ties in with exactly what we've seen
with the other graph so it's a nice
proof of the way that zing works in
terms of eliminating the
the issues that you were seeing
yes and now looking at the at the time
at the 99th
percentile and this is the net time that
the requests are are
taking then you see here that on the
99th percentile
the zinc server is still providing lower
cpu
we saw that in the former graphs but
here we also see that it's performing
comparably at the 99 percentile so we're
not losing any speed
of answering requests and we see that in
terms of the
total amount of what we can do with this
server we can actually do
about 30 percent more
so with this specific application
just by moving and of course because we
are now
much more um i'd call it
familiar and and and confident with our
with our zing capabilities
we just replaced it over a night on all
of our servers
and just kicked out 90 servers so i now
have
a spare of you know i have in my pool 90
servers that i can allocate to a
different place and that's 30 percent
savings
on this specific application just by
moving them
to zing so from a cost point of view
you figured that was um that was an easy
sell
that was an extremely easy sell because
looking
at an application and looking at the
server and looking at what we can do
with it
our ability to provide services with a
smaller it put
footprint our ability to serve our
clients
faster better with less errors all of
that together
is eventually revenue a on the lack of
errors and better serving that's revenue
goes up
on the reduction of server footprint
that is revenue that i'm or capital that
i'm keeping home
and i'm not spending so yes there is
some level of spend on the
on the zoo licensing but you put all of
it together
it totally fits and there's a proven roi
for this project and that's why we are
continuing to deploy
zing wherever we just can it actually is
an extremely popular
application or i'd call framework within
taboola
where it became a brand name with our
developers
they are aware of it it's not me and i t
that i need to push this out and say oh
no no new application make sure that
you're
testing it on uh the correct the correct
jvm make sure that you're
um in production and you're zingified or
however you would like to call it
you can just it's just grass rooted and
everyone wants it now
right i was gonna say i imagine your cfo
is very happy when he looks at the the
numbers for that kind of thing
yes absolutely good okay well um
what we'll do i'll just just um just to
conclude the the webinar
part of things and just mention a few
things around zing obviously
we've heard the success story here and
and how uh you've had some
really quite impressive results that
have uh helped
immensely in terms of um the data
footprint you've got in the
number of machines and so on and that's
what we're really trying to do with zing
is to produce a low latency high
throughput jvm
and again as you said is it's the
simplicity of it being a drop in
replacement
you don't need to recode anything you
don't need to recompile you don't even
need to change your startup scripts it's
a very simple migration from using
the old jvm to using zing and
what we're really focusing on especially
with with your type of application loads
is eliminating timeouts
by eliminating the latency associated
with garbage collection
uh specifically making sure that your
users meet their expectations
and also supporting bigger workloads on
the same hardware
which as we've already explored reduces
the cost because
you're reducing the provisioning costs
and so on and what i would say is if
anybody's interested in this
and trying it with their own
applications then you
uh we have a free 30-day trial for zing
you can go to azure.com
zing trial you can download it we have
engineers who can help
you in terms of setting things up making
sure that everything's running the way
that it should do
even though there's a drop in
replacement obviously one of the things
that we like to help
customers with is setting up the way of
measuring the performance
and so we've got some nice tools that
help people to understand exactly what
the performance level is
before using zing and then using zing
with their application we can show the
effects on latency for the jvm because
it's quite important to do that
because although obviously you're
looking at application level
what are your users getting it's also
important to understand how is the
effect
in terms of the application interacting
with the jvm so we've got tools that we
can help with that and produce some nice
graphs
again that you can show to your cfo and
say look this is the the results we get
this is why we need zing
um so that that's pretty much the end of
the slides in the the presentation part
so i guess what we'll do now is we'll
see if
anybody has any questions uh so if we go
to the
hopefully there'll be some questions
i can share how we started out do you
have any questions or do you want me to
share
oh no if you share how you started out
and then we'll see if anybody has
any questions yes happily so
um this was over three years ago when we
started out we actually have
a few we had back then a few monolithic
applications
and we started with a one terabyte
of heap application that was a single
server and and it had this huge heap
and that was our kind of first test
where we had these
15 minutes cycle of garbage collection
where the server would just stop
responding
for 15 full minutes and would
would do the garbage collection which is
just the way java works and
we would accept that as long as because
this is a
back-end application it was our billing
processing
and once we were able to optimize that
and we suddenly saw that we're getting
this flat work there that was
um one of the first time aha moments
but i do want to say something about
cassandra that i didn't have a chance to
say earlier
with zing we are able to do something
amazing
that is just unattainable unattainable
without zing
we reduced our node count and
created extremely dense nodes
over our hardware so we're using the
same hardware
but we are now able to put multiple
terabytes of data
on a single cassandra node which is
usually not recommended actually
if you go to the kind of best practices
you're supposed to do
up to one terabyte yeah up to one
terabyte of
physical storage per cassandra node but
we're going
way beyond that and without zing that
could not have been done
due to latency due to heat issues due to
uh the actual ability of the server to
answer
quickly so there were not only the
ability to answer faster but the
ability to condense our cluster into
more dense nodes and now i see we have
multiple questions i was gonna say so
we've got some questions now so the
first question
is what gc algorithm were you using or
were your applications using before you
installed zing
do you know i would assume you're
probably using g1
g1 exactly we're using uh g1 and um
and again this was four years ago and uh
that was
i think we played around with a few
others but g1 was the only
the only one that in scale at least uh
was was able to cope with what we were
doing and then we moved on
into zing once we saw the light right
uh second question could you talk about
the efforts you spent on
tuning the nonzing jvm compared to the
performance work you did
since zing uh so that's kind of an
interesting question did you spend a lot
of time trying to performance tune
the old jvm so
uh this was this was years ago and um
on the we did on the monolithic uh
application
less on the front-end servers the
front-end servers we were going like
we don't have a problem it's you know
these small freezes you don't even
really grasp what's happening there
uh until you until you see but yes we
tried
uh we tried different um different
approaches we tried to reduce our heap
size we tried to increase our heat size
we tried to change our cash um
our cash strategies we tried to do
multiple different things
in terms of how we can optimize our
application
and actually even since moving to zing
we've been working with azul
on our uh on our performance and
continuously beyond the just
default improvements we've been
continuously seeing
additional improvements on many fronts
some of them around avx-512 some of them
around um cpu speed some of them around
cpu pinning some of them around numa a
lot of different
uh ways that we that we've been tweaking
this around
but by far the easiest and fastest and
most
um the word lucrative isn't isn't
correct here but the most beneficial i
think that's the best word to use here
for us in terms of cost performance was
just
we moved to zing and we get all these
benefits
uh behind it it was so simple and
allowed us to focus
on uh system optimizations drive
optimizations
um application optimizations and and the
rest
that that is the thing that made most
sense for a sense
for us okay um
and sort of related to that what um the
question is what tools did you use to
monitor zing
okay so uh our monitoring
is based we don't monitor zinc directly
we monitor the whole the whole system
and there's a whole monitoring
framework in there so the observability
part is based on
grafana the metrics are
the databases prometheus it was a metric
tank back then
the um log ship the metric shipping
was based uh is based now on on kafka
i'm trying to think what was based back
then it was senzu
as the age as the local agent there
there's a whole lot of different there
there was not one tool that we used but
all the graphs you saw here
are our grafana graphs the database now
is as i said is the prometheus
uh database and we're pulling the the
data
with the with the local agent so that's
um i guess the easiest and shortest
answer i can give here
if anyone wants to reach out to linkedin
i can i can provide a much much longer
answer okay another question here is
you spoke about latency flattening have
you ultimately managed to reduce
costs by migrate migrating to zing um i
think
yes so i i spoke about that uh briefly
but i can
i can elaborate uh here a bit more we we
saw
great cost savings um just by
in a few places first our ability to
reduce the server the server count in
terms of the clusters
we reduce the size of a clusters that's
one
then two uh we improved
our our actual front-facing application
so if
earlier four years ago we would be
missing
one percent or two percent of our
requests
timing out or just working slower for
the general
user population now we're answering much
faster and
i think it's it's almost now a common
knowledge that the faster you are
online the the more chances you have of
a user actually
getting that piece of content on his
device and actually having the ability
to click on it
because if you're too slow the user just
doesn't get to see you
uh so it's it's been improving our
application
reducing our i.t footprint reducing our
hosting costs
delaying the need for new hardware all
of those together
uh brought us to the the relevant cost
reductions and beyond
yeah so i think there's two things that
isn't that there's obviously the ones
you can easily look at which is
you can measure how many servers you
saved how much uh you spent on the
licenses
but then as you said there's the the
cost or the uh the benefit
of being able to serve your customers
quickly and therefore get more
uh business if you like through that um
there's another question here which i
think we answered or you answered at the
beginning which is how large is the
estate and i think you did say how many
machines you have
in your yes so we have 8500
servers that's 8 500 circle we actually
have a bit more now that's not a
new slide uh so we have a whole lot of
servers
um not all of them are our java or zing
but all
but but we're at thousands of servers
with zing
some of these servers are hdfs for
example hdfs
doesn't get as much
benefit out of zing or we have other
technologies in taboola such as
vertica which again is not a java based
java based
application so wherever we are java by
now i don't think there's a corner in
tabula
where we we have a java application and
we don't have uh zing beneath it
powering it up good good to know good to
know
right um and then i got one more
question which is um
we tuned jvms based on load how did you
have to tune
uh batch versus online for zing
uh does that make sense very interesting
yeah very interesting so
we have our front end applications which
is actually
all the graphs i showed were for our
front end applications and i only spoke
briefly
at the end now about our batch
processing with that
big monolithic one terabyte application
that we started out with
so on front-end applications it's it's
almost a no-brainer um
you see you have 20 servers or 60
servers or 100
in our case hundreds of servers if
you're on aws of course if you're on the
cloud
then instances but anyone with whatever
their cost model is
and you can just reduce that footprint
so that's an easy
and easy way to look at it because you
can do just
more um transactions per server
more transactions per second or you have
less errors whatever the metric or the
relevant metric for you is if it's
latency
sheer volume cpu load whatever
on the back end applications um what
actually was that was the selling point
for us
is this is actually i'm smiling because
i i
now i'm recalling that feeling i'll tell
you about it so we had this back-end
application as i said which was our
billing application
and it would crunch all the data coming
in so we have the single server
it's now on spark it's now totally
different with uh with a totally
different technology but back then
it was this big monolithic java
application
and we have all the billing data coming
in all the lines of logs coming in and
it would pick them up from the drive
and then crunch crunch crunch punch
crunch then and provide a
status of um of where the where where
our system is today in terms of billing
and think that it had to crunch billions
of lines of logs a day
now it would pause on on um
on gc every 15 minutes and if there were
and just think of the boot time of this
server and think of the
if let's say you had a version on it
that was problematic
and so you see the line going down but
then suddenly gc oh okay now you're
waiting it's going up you're waiting
waiting
uh it finished the gcf for 15 minutes
and then it's processing again and
you're trying to see is it catching up
or not
and i remember days on days just looking
at these graphs of
the the latency of this server how much
catch-up time
is it taking and every time we had this
big maintenance work or
we had to deploy on it i mean who
remembers deployments anymore
we we're down continuous deployment
we're not there we don't live there
anymore
but back then this was a huge thing
moving to zing
and this sun flattening of the line no
more gc no more
hold the world i want to just garbage
collect now
and that kind of feeling of
i can just see what the server is doing
continuously
no this this really this feeling of
relief so i know that's not a number i'm
sorry it's not a number but it was
a single server so luckily enough
when when we when we did the licensing
uh a single server
is not a big deal in terms of licensing
and just for that feeling
i mean it was just it was it was the um
it was the i.t administrator
appreciation day last friday i mean
feel it for me man feel it for me uh
whoever the question came from
yeah that's excellent um i'm always
happy with happy customers even if it is
only one license if
it made your life much simpler then
that's better okay um
that seems to be all the questions we
have um and so we're at about 40 minutes
so that's that's pretty good
so i think what we'll do is um oh no
hang on um
we're in the same boat now uh it's
challenging to convince the management
on how much cost savings to bring to
organizations thanks for sharing this
information on working 24 7. um
great so hopefully we've got another
person there that's looking at zing and
and
we'll try to use the trial and and
hopefully uh get the same
benefits that you've had from that um so
uh just to wrap up i would like to say
that
um as i said at the beginning we have
recorded this session and we'll be
sending everybody a link
to the recording so you can share it
with family and friends and we will also
send you a copy the slides
and i would say like to say a very very
big thank you to you ariel for
all of the information you shared with
us and um and
like i say it's it's a happy customers
is a happy
azul so with that um thank you
to everybody for attending the webinar
thank you for having me