Manual QA is boring
Also, humans…
• are slow
• require food / water / sleep etc.
• make mistakes
But writing browser tests is
also boring …
• Same pattern, over and over:
• visit page
• click stuff
• check things have changed
• fill in a form
• click something
• check things have changed in the right way
• repeat ad nauseam
We can’t find good programmers
who are willing to do this
• So either you hire bad ones and get…
• poorly maintained code
• lots of hand-holding
• Or your systems don’t get properly tested
When programming is
boring, automate it more
• Memory management is boring
• => garbage collectors
• Style-checking code is boring
• => automatic linters
• Remembering opcodes is boring
• => assembly language
• Extracting a method is boring
• => automatic refactoring tools in IDEs
• Everything is boring
• => macros
So what are integration
testers actually doing?
1. Look at current state
2. Pick something to do
3. Tell computer to do it
4. Check that the state changed as you’d expect
5. GOTO 1 UNLESS DONE
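That loop can be sketched directly in Clojure. Everything below (`run-journey`, the `:perform` / `:expect` keys, the toy counter action) is made up for illustration, not part of any real library:

```clojure
;; A hedged sketch of the tester's loop, run against a toy counter "system".
(defn run-journey
  "Repeat: pick an action, perform it, check the new state, until done."
  [actions initial-state done?]
  (loop [state initial-state]                           ; 1. look at current state
    (if (done? state)
      state
      (let [{:keys [perform expect]} (rand-nth actions) ; 2. pick something to do
            state' (perform state)]                     ; 3. tell computer to do it
        (assert (expect state state')                   ; 4. check the state changed as expected
                "state did not change as expected")
        (recur state')))))                              ; 5. GOTO 1 UNLESS DONE

;; Toy action: increment a counter, and expect exactly +1.
(def inc-action
  {:perform inc
   :expect  (fn [old new] (= new (inc old)))})

(run-journey [inc-action] 0 #(>= % 5))
;; => 5
```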
But how do they know how to
do these things?!
(if it’s not simplified, then it’s actually just the system)
(map vs territory)
Where does that model
come from?
• asking other people (devs, designers, managers,
etc.)
• reading the source code
• documentation (diagrams, free text)
• “common sense” reasoning (e.g. if a button says
“add” it should probably add something)
The model is implicit
• Not encoded in a way that a computer can do
useful things with it
• You can kinda infer the model from reading tests
(which is why Rubyists are fond of calling their
tests specs, right?)
• but there is so much to gain from having a model
be machine-readable
What we want
• generate a series of user actions (a “journey”)
• => [ (add todo called “hello”), (mark it as done) ]
• know how to perform those actions on the real system
• => fill in text box with hello, press enter, click tick on the new list item that just
appeared
• know what those actions should do to the system state
• => after you add the first todo, there should be one todo listed in the todo list
• be able to confirm those actions did what they should to the real world
• => expect($('li.todos').length).toEqual(1)
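In Clojure terms, the journey and its expected effect are just data. A minimal sketch, with made-up keys and action names (`:action`, `:data`, `:todos`), plus a pure model that replays the journey:

```clojure
;; A "journey" as plain data: a vector of generated user actions.
(def journey
  [{:action :add-todo  :data {:title "hello"}}
   {:action :mark-done :data {:title "hello"}}])

;; A pure model of the system: each step transforms the state map.
(defn step [state {:keys [action data]}]
  (case action
    :add-todo  (update state :todos conj (assoc data :done? false))
    :mark-done (update state :todos
                       (partial mapv
                                #(if (= (:title %) (:title data))
                                   (assoc % :done? true)
                                   %)))))

(reduce step {:todos []} journey)
;; => {:todos [{:title "hello", :done? true}]}
```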
A model we use at Zendesk
• Create apps by writing JS / HTML / CSS / JSON,
zipping it up, uploading it (either via web interface
or REST API)
• Install apps by choosing an app through the web
interface and clicking install OR posting to
/api/v2/installations.json and giving it the ID of
the app you want to install
Example journey
• Generate a valid app
• Post its details to the /api/v2/apps.json endpoint
• Generate new installation details, using ID we got
back from the Apps endpoint
• Post to /api/v2/installations.json
• Confirm that an installation with the right details
exists (the right name, app_id, etc.)
So what does our model
need to contain?
We want to be able to generate a journey, then undertake it, and make sure everything works
• when is it possible to do this?
• e.g. can’t install app if you haven’t created one, can’t delete a todo
unless you’ve added one
• how does it transform state?
• e.g. installing app increases installation count by 1
• what does it “do”? how is the action actually performed in the real world?
• in browser: click on app in marketplace, click “install”, etc.
• do a POST to /api/v2/installations.json
• What kind of data does it need?
• i.e. what does an app look like? what’s going to be in that zip file?!
For each type of action:
Create App action
• Can always create an app!
• transform: It adds the created app to the list of
apps in our state
• To “perform” on the real system, it serialises the
generated app data into files (JSON, .js, folder
structure, etc.), then zips it up, uploads it to API
(note: actually more complex now, as model has grown)
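One way to encode that slide as a Clojure action map — a hedged sketch; the key names (`:possible?`, `:gen`, `:transform`, `:perform`) and the stubbed upload are assumptions, not the real Zendesk code:

```clojure
;; Hypothetical :create-app action map. The real :perform serialises the
;; app to files, zips it, and uploads it via the API; stubbed here.
(def create-app-action
  {:possible? (fn [_state] true)                 ; can always create an app
   :gen       (fn [] {:name "My App" :files {}}) ; stub; real version generates data
   :transform (fn [state app]                    ; pure update of the model state
                (update state :apps conj app))
   :perform   (fn [app]                          ; real-world side effect (stubbed)
                (println "zip + upload" (:name app)))})

((:transform create-app-action) {:apps []} {:name "My App"})
;; => {:apps [{:name "My App"}]}
```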
Disclaimer
• This stuff could be a whole lot neater
• My excuse:
• Also, I’ve prioritised growing the model over making it clean, because:
• it’s really useful (this is running on every deploy, and catching bugs
already)
• I don’t want to prematurely abstract before I understand what
complicated models look like
How do we generate test
data?!
For all vectors of integers, v,
sorting v is equal to sorting v twice.
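That's test.check's canonical property. Assuming `org.clojure/test.check` is on the classpath, it reads:

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; For all vectors of integers v, sorting v equals sorting v twice.
(def sort-idempotent
  (prop/for-all [v (gen/vector gen/int)]
    (= (sort v) (sort (sort v)))))

;; Run the property against 100 generated vectors.
(:result (tc/quick-check 100 sort-idempotent))
;; => true
```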
(doesn’t just brute force it, in case you’re wondering…)
serious limitations though
(and the creator seems okay with that)
(I don’t know enough combinators and monads)
(first time I’ve ever found myself thinking that)
But how do we get from actions
+ data to a full “user journey”?
“this is not ideal”
- Alistair Roche, August 2015
lesson learnt: writing macros to transform code to
feed to other macros is not fun
that whole data > fns > macros thing is true
(but also it works fine for now so CBF changing it)
(clojure.test integration, so I can just run “lein test” on jenkins)
to keep the examples simple; in our live version I also have uninstall
assertions are just sugar on top; the most scary bugs are going to manifest as …
Does it make maintenance
easier?
• Imagine you can now only install an app twice if it
doesn’t depend on ticket fields (this was a real
change)
• Instead of going through and changing each spec
where I install an app-which-depends-on-ticket-
fields twice, I just change one place: the
:possibility-check of the :install-app action
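A hedged sketch of what that single place might look like — the function name, key names, and the exact encoding of the rule are all illustrative:

```clojure
;; Hypothetical :possibility-check for :install-app. The rule "you can
;; only install an app twice if it doesn't depend on ticket fields"
;; lives here and nowhere else.
(defn install-possible? [state app]
  (let [installs (count (filter #(= (:app-id %) (:id app))
                                (:installations state)))]
    (if (:depends-on-ticket-fields? app)
      (zero? installs)   ; ticket-field apps: first install only
      (< installs 2))))  ; others: up to two installs
```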
Will it catch more bugs?
• Given enough time and a representative model, it
will explore more of the possible code paths /
state spaces than a human would
• so yes.
But what about critical
paths?
• Would like to add at some point a way of weighting the
likelihood of transitions, so that you’re more likely to get
a journey that creates, installs, uninstalls, deletes an
app, rather than [:create :create :create :create]
• (right now it’s a random choice of all possible actions.
could be weighted. based on analytics / logs?!?!)
• super easy to do this: just another entry in the action
map which puts a probability on the action being
followed by the other actions. or could be a function,
and the weight is based on state…
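A minimal sketch of that weighting — a flat `:weight` per action rather than per-transition probabilities, with made-up names:

```clojure
;; Weighted random choice of the next action: each action map carries a
;; :weight entry, and we sample proportionally to it.
(defn pick-weighted [actions]
  (let [total (reduce + (map :weight actions))
        r     (rand total)]                       ; uniform in [0, total)
    (loop [[a & more] actions, acc 0]
      (let [acc' (+ acc (:weight a))]
        (if (< r acc') a (recur more acc'))))))

(def actions
  [{:name :create-app  :weight 1}
   {:name :install-app :weight 4}])  ; :install-app four times as likely

(:name (pick-weighted actions))
;; => :create-app, or (more often) :install-app
```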
Can use same model for API
tests and browser specs
• Currently doing API specs
• BUT have browser specs prototyped
• it’s just a matter of changing the :perform function!!!
(instead of POSTing to API, visit a page and click a
series of buttons)
• have done browser tests like this for another Zendesk
App as part of lab day but (I SWEAR THIS IS TRUE)
my laptop broke and I hadn’t pushed to GitHub
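What "just change the :perform function" could look like — both bodies are stubbed with printlns, since the real HTTP and WebDriver calls aren't shown in the talk, and all names are illustrative:

```clojure
;; Same action map, two interchangeable :perform implementations.
(defn perform-via-api [installation]
  ;; real version: POST the installation to /api/v2/installations.json
  (println "POST /api/v2/installations.json" installation)
  installation)

(defn perform-via-browser [installation]
  ;; real version: visit the marketplace page, click "install", etc.
  (println "clicking install for app" (:app-id installation))
  installation)

(def install-action-api     {:action :install-app :perform perform-via-api})
(def install-action-browser (assoc install-action-api :perform perform-via-browser))
```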
What happens if you share the model
between test and application code?
would look SO MUCH BETTER
if i used clojure’s polymorphism features
like multimethods or protocols
Far far future
• Fire ’n’ forget blackbox testing
• Point it at a legacy system with an interface
(whether that interface is the DOM, a SOAP API
(lol), a set of COBOL procedures)
• Let it discover through trial and error how the
system works
Origin story
• I gave this problem to a test engineer I was
interviewing (on a busted old Zendesk system (I
didn’t tell him how it worked (this sentence looks
like Clojure code (now)))) and watched his
learning / discovery process
• Trial and error mixed with abstract pattern
recognition
clicking every button, entering stuff into forms, etc., etc. (aka “fuzzing”)
but unlike fuzzing, watch what changes in the DOM
spot patterns
“…click this button, this text changes, and that text looks like a number, and it’s…”
(way more sophisticated systems outside
of mainstream proglang ecosystems, by the way)
• my guess is it’s gonna need a human showing it
some stuff to constrain its actions / thinking, for it to
be useful beyond simple systems
• user with a browser extension can quickly teach it
which buttons tend to be clicked in what order, or
what type of data a field expects, or how inputs
relate to outputs
• could it learn by watching the millions of users
using the app?
White box future
• Model of application behaviour stated
declaratively
• Then shared between test code and application
code (not to mention documentation
generators…)
• Have you ever thought about how many places
business rules are encoded? In code repos,
neurons, wiki articles, chat logs, animated GIFs
Generative Testing in Clojure