👨‍💻 ESPHome: Nothing-to-firmware in 30 minutes

Last week, I wrote about the tech we deploy on my parents’ 100-acre property.

Our goal is to be able to quickly and cheaply integrate new sensors and controls. We want to be able to buy a flow meter / particulate matter sensor / pressure transducer / other crazy thing off AliExpress, and get it integrated into our control plane without it becoming a massive project.

Too many electronics and automation projects just die on the desk. We want the flexibility of our own electronics, but we want to be off the breadboard and into production as fast as we can.

As a software developer, the place I often get tripped up is, amusingly, the coding stage. Some fairly simple electronics suddenly need a lot of code to actually integrate well. Writing it yourself is a black hole that’s easy to fall into.

Last year, I fell in love with ESPHome: the perfect piece of glue between ESP-based devices and Home Assistant.

Walk-through

I sat down and recorded a 30-minute walk-through of getting started with ESPHome.

To follow along, you’ll need only three things:

  1. A running instance of Home Assistant
  2. An M5Stack Atom Lite (or any other ESP-based device you’re already comfortable with)
  3. A USB-C cable

It’ll take you through:

  1. 00:40 – Why Home Assistant
  2. 02:40 – Installing ESPHome
  3. 03:20 – Creating your first ESPHome node
  4. 06:00 – Handling secrets
  5. 07:10 – Controlling the RGB LED on the M5Stack Atom Lite
  6. 09:10 – Doing the first flash via USB, with esphome-flasher
  7. 14:15 – Adopting the device in Home Assistant
  8. 16:00 – Detecting the button on the M5Stack Atom Lite
  9. 17:10 – Cleaner code with substitutions
  10. 18:30 – Doing the subsequent flashes over-the-air
  11. 20:45 – Adding light effects
  12. 21:15 – Adding automation on the device
  13. 23:50 – The firmware from my intercom project
  14. 25:40 – Lightning detectors and particulate matter sensors
  15. 26:30 – “Covers”, for garage doors, blinds, and pool covers
  16. 27:40 – Opinions on Tasmota, in this context
  17. 28:50 – Other devices, like the Shelly1

Final Code

Here’s the final code from this demo:

substitutions:
  device_name: demo2
  friendly_name: Demo 2

## Boilerplate
esphome:
  name: ${device_name}
  platform: ESP32
  board: m5stack-core-esp32

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: Fallback ${device_name}

captive_portal:

logger:

api:
  password: !secret esphome_secret

ota:
  password: !secret esphome_secret

## Device-specific
light:
  - platform: fastled_clockless
    chipset: WS2812B
    pin: 27
    num_leds: 1
    rgb_order: GRB
    id: status_led
    name: ${friendly_name} Light
    effects:
      - random:
      - flicker:
      - addressable_rainbow:

binary_sensor:
  - platform: gpio
    pin:
      number: 39
      inverted: true
    name: ${friendly_name} Button
    on_press:
      then:
        - light.toggle: status_led

👨‍🔧 Maker tech on the land

I live in an inner-city apartment. There’s a concrete slab, brick walls, and no ceiling cavity access. Oh, and we rent, so drilling into things is frowned upon. The home automation potential in a scenario like this consists of some coloured lights, and watering about four plants on a timer. It’s not exactly an inspiring IoT site. There’s also not much wrong with the humble light switch that works instantly, every time.

In complete contrast to this, my parents live on 100 acres / 40 hectares of land, a few hours out of the city. It’s quintessential rural Australia. They’re mostly off the grid: there’s a skerrick of Telstra 4G signal, and some power, except when there isn’t, which is surprisingly often. This is an absolute IoT playground, and we’ve given that a fair crack over the seven years they’ve spent establishing themselves there.

Opportunity

On a property like this, water is critical. There are the basic living requirements, the opportunity of gardens, and the very real risk of bushfires. All up, we have ~300,000 litres of water tanks on the property, and then a small dam and two creeks. We use sensors to track tank volumes. We measure flow rates at key plumbing points (for both instantaneous oversight and tracking cumulative usage). We use power meters to detect when pressure pumps are running.

Energy management is also important. Whilst there is the option of grid connectivity, we try to run as off-grid as possible, or at least export-only. This requires some smarts around load management. For example, instead of just having a hot water system that aggressively chases a specific temperature, we want the hot water systems to head for a temperature range, but only use excess solar production to do it. If there’s a bit less sun for the day, it’s ok if the water is a few degrees cooler: don’t burn through the batteries or import power just to hit a specific temperature.

And then there’s a safety aspect. The property is on the top of an escarpment where storms can roll in fast from a few different directions. By running things like lightning sensors on the property, we can trigger our own localised alerts for approaching storms.

Challenges

The challenge is to convert all these possibilities into something real, that works, and doesn’t cost an absolute mint. Over the years, we’ve found this incredibly hard to solve for. You’ll find a solution for measuring tank volume, but it comes with its own LoRa gateway, a cloud dependency, and a new app. You’ll find a cheap Z-Wave temperature sensor, but it’s only really good for a room, and doesn’t have a probe that you can put into the measurement point in a hot water system. You’ll find a flow meter, but the only option is an industrial solution that wants to talk RS485 serial. Who even does that anymore⁈ You’ll find a garage door opener that works for the extra-high roller doors on the shed, but it has its own 433MHz RF remote.

It’s easy to end up with a horrible mishmash of radio technologies, software platforms, and APIs, not to mention some scary pricing as you drift from the traditional world of home automation into the more dated world of industrial automation.

Goal

We want to be able to dream up crazy new ideas, pick and choose the right sensors, and then integrate them with relative ease and consistency. That means balancing the ability to build new things ourselves against not having to custom-fabricate a circuit board and write new firmware every time we add a new sensor.

Considerations

Sense

Most sensors give out some kind of analogue signal (like a pressure sensor where the output voltage varies from 0V to 5V depending on applied pressure), a pulse (like a flow meter that pulses once for every 500mL of water that flows through it), or a digital signal (like a temperature and humidity sensor with a 1-Wire output).

To handle all of these scenarios, we’ll need some GPIO pins (the most basic of digital connections), and something with an analogue-to-digital converter onboard (so that we can measure that variable pressure scenario).
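
To make that concrete with ESPHome (the toolkit from the walk-through above), reading a pressure transducer is just the adc sensor platform plus a calibration filter. This is a minimal sketch: the pin and kPa scaling are hypothetical, and a 0-5V sensor first needs a voltage divider, since the ESP32’s ADC only reads up to about 3.3V.

sensor:
  - platform: adc
    pin: GPIO32                # hypothetical: any ADC-capable pin
    name: Tank Pressure
    attenuation: 11db          # read the full 0-3.3V range
    update_interval: 10s
    filters:
      # hypothetical calibration: map the divided voltage back to kPa
      - calibrate_linear:
          - 0.0 -> 0.0
          - 3.3 -> 500.0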

Affect

Most of the outputs that we’re dealing with are straight up binary: turn a pump on, open a valve, flash a light. This means that they can again be driven by some GPIO pins, paired with an appropriately sized relay.
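
In ESPHome, that relay is a gpio switch. A minimal sketch, with a hypothetical pin and name:

switch:
  - platform: gpio
    pin: GPIO26                # hypothetical: the pin driving the relay coil
    name: Transfer Pump
    restore_mode: ALWAYS_OFF   # fail safe: stay off after a power cycle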

For more complex outputs, like sound, we can defer back to more consumer-grade hardware, like just broadcasting an announcement on the Google Home speaker in the kitchen.

Plug and Play

As much as we’re building things ourselves, we don’t need to put too many barriers in our own way. We’re after a module that we can connect sensors to easily, without too much soldering, and certainly without having to build up our own circuit boards / Veroboard. We want to be out of the lab and into the environment as fast as possible.

Secure

There’s a persistent joke about how the ‘S’ in IoT stands for security.

Security was absolutely a consideration for us though: both on-property, and when it comes to remote access. When you’re switching things like power, or controlling a precious resource like water, you want to be confident that you’re the only person in control.

Our preference has been to keep connectivity and control local to the property, via authenticated connections, and then apply a separate remote access approach further up the stack. This means the hardware needs to have enough power and smarts to handle secured local connections, and not be dependent on its own path to the internet.

Compute

Some sensors require both precise timing and persistence to count things like pulses and turn them into natural measures. For example, a flow meter might give you a signal pulse for every 500mL of water that flows through it. If you miss a pulse and stop counting for a bit, you’re missing water. Our preference has been to count pulses on the hardware attached to the sensor, and then report back the natural values (L/min, L/day) whenever the network and central compute is available. We want to keep sensor-specific concepts, like pulses, at the sensor, and just send meaningful information over the network.
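
ESPHome’s pulse_counter sensor does exactly this on-device: it counts pulses between reports and publishes a rate. A sketch for the hypothetical 500mL-per-pulse meter above (the platform reports pulses per minute, so halving it yields L/min):

sensor:
  - platform: pulse_counter
    pin: GPIO33                # hypothetical: the meter's pulse wire
    name: Water Flow Rate
    unit_of_measurement: 'L/min'
    filters:
      - multiply: 0.5          # 1 pulse per 500mL: pulses/min × 0.5 = L/min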

As much as we want things to be connected, they can still be somewhat smart in their own right. If a hardware module has both a temperature sensor and a relay to turn a heating element on and off, it’s perfectly capable of being a thermostat on its own, regardless of what’s happening to the wider network or any centralised compute. It will be smarter when the central controls are online, because it can be aware of other datapoints like the solar charge status, but it doesn’t have to be totally bound to the availability of those other systems to operate its basic function. Achieving this balance requires us to be able to run basic logic directly on the module.
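
ESPHome’s bang_bang climate platform captures this pattern: the thermostat loop runs on the module itself, and Home Assistant can adjust the target range whenever it’s online. A sketch, assuming a temperature sensor and a relay switch defined elsewhere, with the hypothetical IDs water_temp and element_relay:

climate:
  - platform: bang_bang
    name: Hot Water Thermostat
    sensor: water_temp                    # temperature sensor defined elsewhere
    default_target_temperature_low: 55 °C
    default_target_temperature_high: 65 °C
    heat_action:
      - switch.turn_on: element_relay     # relay switch defined elsewhere
    idle_action:
      - switch.turn_off: element_relay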

The world is not static. If we want to run logic on these devices, and keep them secure, they need to be able to receive firmware updates over-the-air. We don’t want to be clambering around under sheds with laptops to reprogram a thermostat.

Connect

Network standards are still the multi-billion-dollar question in IoT.

Early on, we deployed a number of Z-Wave and Zigbee based devices. These are two different mesh protocols, at the mid-point of their VHS vs. Betamax battle for dominance. They’re common in consumer automation solutions like smart switches, and good for extremely low power environments, like where you want to be able to throw a temperature sensor in the corner of a room with a coin-cell battery in it and then forget about it for a year. The sensor ecosystem is very consumer focussed (you’ll find a million temperature sensors, but no tank pressure sensors). The communication protocol is constrained: by design, it’s a very low bandwidth data network, operating up to 40kbps for Z-Wave, or a whopping 250kbps for Zigbee. Range is limited to keep within the power limits, so typically as low as ~15-20m. There’s no common way of building on-device logic, and if you do manage to, then it’s incredibly hard to apply firmware updates over-the-air for either of them.

Our exploration continued into LoRa and NB-IoT, but we’ve opted away from both for this property. They’d each be very compelling options if we wanted to cover the larger property, such as if it was more of a working farm with distributed infrastructure than predominantly bushland and gardens with clustered buildings.

Ultimately, we settled on Wi-Fi as our preferred connectivity on the property. We’ve already got great coverage throughout the house, cottage, shed, and key outdoor locations via a UniFi deployment. Whilst this is heavily centred on the built infrastructure, and not the full 100 acres of property, that’s where most of the sense-and-act needs to occur anyway. Investing in the Wi-Fi coverage provides other benefits, like Wi-Fi calling for mobiles where we’re otherwise on the fringe of 4G coverage. The UniFi infrastructure is readily extensible, as Lars has proven on his property with the deployment of Llama Cam, and even extending the coverage to places without power. Finally, it gives us a pretty amazing management plane that’s a lot nicer to work with than trying to diagnose a Z-Wave/Zigbee mesh.

Topology diagram of UniFi network showing switches and access points
UniFi infrastructure topology

Power

We’re happy to depend on a power source for most devices: they’re usually located near other loads that are already powered, or it’s easy enough to get power to them, such as dropping in a 12V cable out to a water tank at the same time as trenching the plumbing in. If we really want to run on battery, it’ll be ok to have a larger 12V battery and a small solar panel or something: we don’t need to try and run off a coin-cell battery for years on end.

Cost

The approach needs to be reasonably affordable if it’s going to grow over time: tens of dollars to add a new thing, not hundreds.

Sensors themselves range from $1 for a temperature probe, to $20 for a lightning sensor, or $30 for a laser-based air particle sensor. Short of buying them in bulk, that’s about the best the prices are going to get: you’re already buying the smallest possible unit straight out of China.

Whilst there’s no room to optimise on sensor cost, it does give something to calibrate against for the target cost of the compute module. We wanted to target ~$5 per module. It felt like the right cost, relative to the sensors themselves.

Hardware Options

So many options! Let’s run through them against those considerations:

Raspberry Pi

By this point, most software developers are jumping in with “Raspberry Pi! You can even buy hats for them!”. That’s certainly where I started. It feels safe: it has ports that you’re used to, you can plug in screens, it runs a whole operating system, and it has an IO header exposed for those plug-and-play scenarios.

Photo of a Raspberry Pi 4 module with ports annotated
Raspberry Pi 4

They’re also completely overpowered for what we need here. These things are small PC replacements: they’re full computers, not basic microcontrollers. Memory capacity starts in the gigabytes when we only need megabytes. They can run dual 4K display outputs when we only need to switch a couple of wires on and off. They need an SD card with a whole operating system on it just to boot. They suck up 3A of power, which is at the expensive end of USB power supplies. They’re also quite bulky compared to other options we’ll look at through this post.

A Pi, power supply, and SD card will quickly run to ~$50, which is 10x our cost target.

✅ Sense
✅ Affect

✅ Plug-and-Play
✅ Secure

🤯 Compute
✅ Connect

😐 Power
❌ Cost

Raspberry Pi Zero W

This is as small as the Raspberry Pi range goes, but it’s still in a similar position to the main Raspberry Pi models: completely overpowered for this purpose, with a 1GHz processor, 512MB of RAM, and HDMI video output. Thankfully, it’s less power hungry, so we’re back into the range of any-USB-power-will-do. It’s also physically a lot smaller: look at the relative size of those micro USB ports.

Photo of a Raspberry Pi Zero W module
Raspberry Pi Zero W

It still needs a micro-SD card in addition to the base board, which takes the bill up to ~$15, still 3x our cost target. From a plug-and-play perspective, you’ll have to first spend time soldering on your own header strip, or pay a bit extra for one pre-soldered.

✅ Sense
✅ Affect

✅ Plug-and-Play
✅ Secure

🤯 Compute
✅ Connect

✅ Power
❌ Cost

Arduino and Adafruit

As we carve away ports and computer power that we don’t need, our next stop is the Arduino or Adafruit Feather range of boards.

Each of these ecosystems has a range of different connectivity options: boards that come with Wi-Fi, Bluetooth, LTE, NB-IoT, LoRa, or none of the above.

If we focus on the Wi-Fi based boards, the “Adafruit HUZZAH with ESP8266” starts to look interesting.

Adafruit Huzzah32

These things are small, at 51mm × 23mm.

They have Wi-Fi on board, a USB port for power and programming, a stack of GPIO pins, and an analogue input. There’s no additional SD card required: there’s flash memory on the board itself. You can buy a version with pre-soldered headers, making that plug-and-play scenario easier.

The compute is right sized for basic sensors: an 80MHz processor with 4MB of flash memory attached.

The only thing on here that’s a little bit extraneous for our scenario is the Li-Po battery connection. (That’s the black connector at the top corner, next to the micro-USB.) The board can both run off a Li-Po battery, and recharge one, as there’s a charging circuit built into it as well. But, for our scenario where we said permanent power was ok, the charging circuit just adds more cost to the board.

Unfortunately, the boards are still up around $20, which is 4x our cost target. There’s also a lot of exposed electronics, so we’d need to factor a case into the price too.

✅ Sense
✅ Affect

✅ Plug-and-Play
✅ Secure

✅ Compute
✅ Connect

✅ Power
❌ Cost

Discovering the ESP

What we’ve looked at so far in this post are different boards.

As I was researching around different boards, I kept running into phrases like “Arduino compatible”, and references to ATmega, ESP8266, or ESP32 chipsets.

It starts to get interesting when you split up the concept of a board, versus a chipset.

The chipset is the processor at the core of each of these boards: the smarts that makes it tick. There are a small number of very popular chipsets. There are then lots of different boards that package these chipsets up with other supporting systems and accessories to make them more accessible: USB serial drivers, voltage regulators, battery charge circuits, different breakout headers, and different physical form factors. It’s these extra systems that drive the cost up, and the brand recognition of the boards that drives the margin.

After some very nerdy reading 🤓, I got quite excited by the ESP range, specifically the ESP8266 and ESP32 chipsets. It turns out these are very popular with makers and manufacturers alike because they hit an interesting sweet spot of cost and capability. If you’ve got a Wi-Fi enabled smart plug in your house, there’s a decent chance that it has an ESP inside it.

These chips were never really designed as a general-purpose compute unit: the ESP8266 came first, and it was intended as a Wi-Fi modem, to be added on to some other device. It has enough compute to run a full TCP/IP stack, some leftover generic pins, and the ability to re-flash it with new firmware. There was originally only Chinese-language documentation, but the low price point and interesting feature set led the maker community off into investigating it further and translating the documentation. The manufacturer – Espressif Systems – saw the opportunity and jumped right in with the release of official SDKs and English-language documentation.

The ESP8266 was succeeded in 2016 by the ESP32 series, which added a dual-core processor, more memory, Bluetooth support, and a stack more peripheral interfaces.

Both chipsets now have an excellent ecosystem of software around them. Of particular interest to me was the ESPHome project: it’s specifically targeted at generating firmware for ESP-based chipsets, to integrate with a wide range of sensors, and then link all of this back to Home Assistant, which is what we’re already using as the integration layer for everything on the property.

Now I could re-focus the search for boards to be based around these chipsets.

Two compelling solutions came to light:

ESP-01S

These modules are positively tiny, at just 25mm × 15mm.

They’re an ESP8266 chipset, with just 8 pins exposed: power, serial comms (mainly useful for programming them), and two GPIO ports. That’s not very much exposed, but it’s just enough to get the job done when you only need to detect one thing and switch another.

ESP-01S module

At only ~$1.50 each, they’re at an incredibly compelling price point. That’s a custom programmable microcontroller, with Wi-Fi connectivity, for under $2. 😲

One minor annoyance is that they need to be powered by 3.3V, so you can’t just attach 5V from an old USB charger. There are however two very prevalent base boards: a relay board, and a temperature/humidity board. Each of these supports the ESP-01S just plugging into them, and can be powered by 5-12V. You can pick up the ESP+relay, or ESP+temperature combo for ~$3.

One frill they’re missing in their no-frills approach is any kind of USB serial driver. That just means you’ll need a separate programmer module for the first-time flash. Once you’ve flashed them once, you should be able to do future updates over-the-air.

ESP-01S USB programmer

For a good getting-started project, check out Frenck’s $2 smart doorbell.
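
The ESPHome boilerplate for an ESP-01S looks just like the demo config from the walk-through above, swapped over to the ESP8266 platform. A minimal sketch (the name is illustrative):

esphome:
  name: doorbell
  platform: ESP8266
  board: esp01_1m              # the 1MB-flash ESP-01S variant

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

api:
  password: !secret esphome_secret

ota:
  password: !secret esphome_secret

logger: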

✅ Sense
✅ Affect

✅ Plug-and-Play
✅ Secure

✅ Compute
✅ Connect

✅ Power
✅ Cost

M5Stack Atom Lite

These modules are the absolute winner: they hit the perfect sweet spot of capability, ease-of-use, and cost. They’re more like a ready-to-go compute module than a dangling microprocessor in need of a breadboard.

They have an ESP32 processor at their core, nine exposed GPIO ports, a button, an LED, and an IR LED. They’re housed in a nice little package, so you don’t have exposed boards. They’re powered and programmed by USB-C. All for ~$5.

M5Stack Atom Lite

The pin configuration is documented right there on the case, making them incredibly easy to wire up quickly. The only piece that’s missing is to know which GPIOs support analogue inputs, but you can cross-reference them back to the ESP32 pinout for all the per-pin specs.

Many sensors can be connected directly to the Atom Lite: plug in +3.3V or +5V, GND, and a signal wire to a GPIO pin. For quick construction, wires can be pushed directly into the exposed header sockets. For a more robust connection, you can add a Grove/PH2.0-4P connector to your sensor, and then plug it into the port at the bottom there, next to the USB-C.

The LED is actually a “Neopixel”, which means that while it only uses up a single GPIO, it’s a digitally addressable tri-colour LED. We’ve used this to provide a multi-colour indicator right there on the device for quick diagnostics in-the-field.

✅ Sense
✅ Affect

🤩 Plug-and-Play
✅ Secure

✅ Compute
✅ Connect

✅ Power
✅ Cost

Pre-built solutions: Shelly1, Sonoff, and more

The prevalence of ESP chipsets continues into pre-built solutions as well.

The Shelly1 is a 230V/16A-capable relay, designed to fit into tiny spaces like behind an existing light switch panel. It can be wired in with an existing physical switch, so you still have the instant experience of a normal light switch, but now it can be Wi-Fi monitored and controlled as well. Here it is with an Internationally Standard Sized Oreo for scale:

Shelly1 on an Oreo

At its core: an ESP8266. And they even expose the programming headers for you right there on the case:

Shelly 1 pinout diagram

They’re Australian certified, but you’ll still need a sparky to actually install them for you if they’re hooked up to your 240V circuits. For lower voltage circuits – like switching irrigation – you can wire them up yourself with a screwdriver and zero soldering.

It’s a similar story for most of the Sonoff smart switches, and a stack of other manufactured devices.

I’ve used some Sonoff devices with good results, but having discovered the Shelly range via Troy Hunt, I’m keen to get my hands on that form factor next.

Implementation

ESP-based modules now litter the property.

Here, an M5Stack Atom Lite ($6) is combined with a DS18B20 temperature probe (~$2, including 2m cable), a single resistor jammed across two pins, a leftover USB charge cable, and a little bit of electrical tape to hold it all together. For sub-$10, and no soldering, we have a Wi-Fi connected temperature sensor dipped into the hot water system, connected back to Home Assistant.
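
In ESPHome terms, that single resistor is the 1-Wire pull-up between the data pin and 3.3V, and the rest is a handful of config. A sketch with a hypothetical pin and address (the real address is printed in the device logs at boot):

dallas:
  - pin: GPIO26                # hypothetical: the DS18B20 data wire

sensor:
  - platform: dallas
    address: 0x1C0000031EDD2A28   # hypothetical: copy from the boot logs
    name: Hot Water Temperature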

A similar setup connects to a flow meter in-line with a pressure pump. The flow meter draws power (3.3V and ground) straight off the M5Stack. For each litre of water that passes through, it’ll pulse 11 times. A resistor is jammed across the pulse return and ground to stop the line floating and picking up noise. The flow sensor can cost anywhere from $5 to $500 depending on what pressure you want to be able to handle, for what type of fluid, and to what accuracy. Ours cost ~$20, so the whole setup was <$30. It’s not the fanciest engineering, but it was zero soldering, and it hasn’t missed a pulse yet.

ESP-01S modules with corresponding relays litter Dad’s desk by the bundle. Right now, he’s building out a relay control, with active feedback, for an otherwise dumb valve. It’s a complex valve that takes 15 seconds to move fully from one state to another, due to the pressure it can handle. This is a perfect scenario for local logic: we can use an ESP-01S to drive the valve for the right amount of time, and then validate the expected state against the feedback wire. A dumb valve becomes a smart valve for <$10.
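
A hedged sketch of that local logic in ESPHome, with hypothetical ESP-01S pins for the relay and the feedback wire:

switch:
  - platform: gpio
    pin: GPIO0                 # hypothetical: drives the valve relay
    id: valve_relay
    name: Valve
    on_turn_on:
      - delay: 20s             # ~15s of valve travel, plus margin
      - if:
          condition:
            binary_sensor.is_off: valve_feedback
          then:
            - logger.log: "Valve failed to reach the open position"

binary_sensor:
  - platform: gpio
    pin: GPIO2                 # hypothetical: the valve's feedback wire
    id: valve_feedback
    name: Valve Open Feedback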

At home in the city, I’ve used an ESP-01S module to retrofit LED Christmas lights with a new sparkle effect and Wi-Fi driven timing controls, based off the shifting time of sunset each day.
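
The sunset timing comes from ESPHome’s sun component, which needs only the device’s location and a time source; no cloud service involved. A sketch with hypothetical coordinates and light ID:

time:
  - platform: sntp             # the sun component needs a time source

sun:
  latitude: -33.87°            # hypothetical coordinates
  longitude: 151.21°
  on_sunset:
    - then:
        - light.turn_on: fairy_lights    # hypothetical light id
  on_sunrise:
    - then:
        - light.turn_off: fairy_lights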

And to bring my very-90s gate intercom handset into the world of mobile push notifications, whilst hiding the electronics within the existing handset.

Getting Started

Your shopping list:

  1. A few M5Stack Atom Lites
  2. A bundle of Grove cables
  3. A bundle of mixed resistors (for pull-up/pull-down on data channels)
  4. Your choice of sensors
  5. The cheapest, most-basic USB-C cables you can find

Program them with either the Arduino ecosystem, or ESPHome (my preferred approach). There are a stack of example ESP projects on Instructables.

UPDATE: I’ve published the next post in this series, which will take you from nothing-to-firmware in 30 minutes using ESPHome.

Have fun!

✍ Friday is your timeline

It’s Monday, and you’ve picked something to work on: we’re finally going to get that new policy launched internally. We just need to write up the post and get it out to the team. It’ll be done and live by tomorrow. Maybe Wednesday at the latest.

Hold on; we missed a scenario. Alex can help fix that. Got five, Alex?

Great chat! That really helped, and thanks for pointing out those other stakeholders we missed. We’ll check in with them quickly.

It’s Friday morning. They’ve found a gap. It’s annoying, but it’s good that we found it, and we can fix it quickly this morning.

Hey folks! I think we’re ready to go, yeah? We really need to get this out this week!

It’s now 2pm Friday.

Stop. Don’t publish.


Work can be like a gas: it expands to fill the container. The easiest scheduling container is a working week: it starts fresh on a Monday with a new burst of optimism, then everyone’s mutual optimism collides into a ball of messy work, and finally culminates with everybody wanting to feel successful before closing off for the week. It’s nice to tick off the ‘simple’ things you set out to achieve at the start of the week, especially if they evolved into something not-so-simple.

There are very few things that make sense to publish after noon on a Friday.

Your end-of-week rush collides with everyone else’s. When most people are also in output/closing mode, it’s hard to effectively inject new input.

Your comms won’t be perfectly contained or reaction-less. Hopefully they’ll be quite the opposite! People will have questions or feedback. They’ll have scenarios you didn’t contemplate. All of these are ok, if you’re able to respond effectively. You could leave the responses until Monday, but that’s a bit of a dump-and-run that pays for your good-feels (yay, we published!) with other people’s unnecessary stress or anxiety over a weekend (but how does this affect me?).


Work on your timeline, but publish to your audience’s timeline. Friday was your timeline; it’s probably not theirs.

Publish when your audience will be most receptive to new input. In a corporate environment, that’s typically earlier in the week, not later.

Think of a post/publish/send button more like a ‘Start Conversation’ button. Press it when you’ve actually got a chance of sticking around and engaging.

Finish your week comfortable that you’ve already got the first set of steps sorted for Monday: all you have to do is hit publish. That’s like two wins for the price of one: finishing the week with everything done and sorted, and starting Monday straight out of the gates.

✍ As you may know…

Announcements to a company, team, or project rarely occur in complete isolation, and thus typically include multiple references to prior context.

Examples:

  • “As you may know, embracing diversity is one of our company’s core beliefs.”
  • “You will recall that Alex wrote last month about the enterprise planning process.”
  • “As you know, we do this through the Figwizzlygig process.”

I frequently see such context wrapped in the filler phrase “As you may know”, usually from a desire to avoid repeating information the audience already has, while still introducing that information anyway.

I always advocate for dropping those filler words. Here’s why:

“As you may know, embracing diversity is one of our company’s core beliefs.”

💡 “Embracing diversity is one of our company’s core beliefs.”

If it’s a core belief, take the opportunity to state it. It’s worth saying again whether the audience knows it or not.

“You will recall that Alex wrote last month about the enterprise planning process.”

If I do recall, the phrase is redundant: it becomes more of a trigger to tune out, as the author has just confirmed that the following information is redundant.

If I don’t recall, then we’ve jumped straight to implied fault: my memory isn’t good enough, or I wasn’t looking in the right place. There are many other plausible scenarios which aren’t the reader’s fault; to start, they might be new to the group/project/company and never have been in the audience for the previous communication. Whatever the case, avoid the unnecessary implied accusation.

💡 “Last month, Alex wrote about the enterprise planning process.”

Changing to a straight-up statement links the context to the prior communication, without any of that other baggage.

Better yet, link to that prior communication. Tools like Yammer, Teams, and Slack all provide the ability to link to a previous thread. This gives the reader a one-click jump back to that important context. Whether they’re new to the audience, or just want to brush up on history, the reader can continue to hop back from one communication to the next. They’ll be able to read the communication, and the resulting replies/reactions.

If you’re stuck referencing back to an email, attach it. For the recipients who never previously received it, the attachment removes the hurdle of needing to ask for that content, leaving them better informed, and you writing fewer duplicated follow-ups. For the recipients who want to go back and re-read the context themselves, it’s now one click away, instead of a search-mission through their inbox. Making the context proactively available helps underline the importance of it, and better respects the readers’ time, especially in aggregate across a large group. You likely already have the original email open yourself, as part of checking that your reference makes sense.

“As you know, we do this through the Figwizzlygig process.”

This is another opportunity to lead the reader towards success and test the availability of information in the process.

Where a process or tool is important to an organisation, it should be well documented, and readily discoverable. Intranet > search > first result > #winning. For many organisations, this is a struggle between the content existing in the first place, it being published somewhere linkable, and then the search/discovery process being up to scratch. Whilst these can add up to seem insurmountable, announcements are a great time to chip away at them: the content you’re talking about is likely the most important to have available.

First up, is the Figwizzlygig process well-defined and documented? When somebody asks to know more, is there something ready to share? If you’re expecting other people to know the content, you should be confident in this. Now’s a great time to check.

Does that content live somewhere accessible to the audience, with a URL? Nobody wants to be trying to get something done but be left second-guessing whether the PDF they were once sent is the latest version. Now’s a great time to check.

💡 “We do this through the Figwizzlygig process.”

If you can find the link, include it.

If you struggle, then you’ve identified a gap.

“As you may know, as you may know, as you may know”

When so many announcements start with context in this format, it also just gets downright repetitive. Take a look at some recent announcements you’ve received or written, and consider how much opening impact was wasted on the phrase “As you may know”.

AAD: app secrets, API-only access, and consent

At Readify yesterday, I saw two different co-workers encounter the same issue within a few hours of each other. Time for a blog post!

Scenario

Problem 1:

I’m trying to use Power BI AAD App Registration to access Power BI REST API using ClientId and Key. The idea is to automate Power BI publishing activities through VSTS without a need of using my OrgId credentials. Pretty much I follow this guide but in PowerShell. The registration through https://dev.powerbi.com/apps sets the app registration up and configures API scopes. I’m able to get a JWT but when I call any of Power BI endpoints I’m getting 401 Unauthorised response. I’ve checked JWT and it’s valid but much different from the one I’m getting using my OrgId.

Problem 2:

Anyone familiar with App Registrations in Azure? Specifically I’m trying to figure out how to call the Graph API (using AppID and secret key generated on the portal), to query for any user’s group memberships. We already have an app registration that works. I’m trying to replicate what it’s doing. The end result is that while I can obtain a token for my new registration, trying to access group membership gives an unauthorized error. I have tried checking every permission under delegated permissions but it doesn’t seem to do the trick.

They were both trying to:

  1. Create an app registration in AAD
  2. Grab the client ID and secret
  3. Immediately use these secrets to make API calls as the application (not on behalf of a user)

Base Principles

App Registrations

AAD has the concept of App Registrations. These are essentially manifest files that describe what an application is, what its endpoints are, and what permissions it needs to operate.

It’s easiest to think of app registrations from the perspective of a multi-tenant SaaS app. There’s an organisation who publishes an application, then multiple organisations who use that app. App registrations are the publisher side of this story. They’re like the marketplace listing for the app.

App registrations are made/hosted against a tenant which represents the publisher of the app. For example, “MegaAwesome App” by Readify would have its app registration in the readify.onmicrosoft.com directory. This is a single, global registration for this app across all of AAD, regardless of how many organisations use the app.

App registrations are primarily managed via https://portal.azure.com or https://aad.portal.azure.com. There’s also a somewhat simplified interface at https://apps.dev.microsoft.com, and some workload-specific ones like https://dev.powerbi.com/apps. They’re all just different UIs on top of the same registration store.

Enterprise Apps

These are like ‘instances’ of the application, or ‘subscriptions’. In the multi-tenant SaaS app scenario, each consumer gets an Enterprise App defined.

The enterprise app entry describes whether the specific app is even approved for use or not, what permissions have actually been granted, and which users or groups have been assigned access.

  • Publisher: readify.onmicrosoft.com
    • App Registration for ‘MegaAwesome App’
      • Defined by development team
      • Describes name, logo, endpoints, required permissions
  • Subscriber: contoso.onmicrosoft.com
    • Enterprise App for ‘MegaAwesome App by Readify’
      • Acknowledges that Contoso uses the app
      • Controlled by Contoso’s IT admins
      • Grants permission for the app to access specific Contoso resources
      • Grants permission for specific Contoso users/groups to access the app
      • Defines access requirements for how Contoso users access the app (i.e. Conditional Access rules around MFA, device registration, and MDM compliance)
      • Might override the app name or logo, to rebrand how it displays in navigation experiences like https://myapps.microsoft.com.

If we take the multi-tenant SaaS scenario away, and just focus on an internal app in our own org, all we do is put both entries in the same tenant:

  • Org: readify.onmicrosoft.com
    • App Registration for ‘MegaAwesome App’
      • Defined by development team
    • Enterprise App for ‘MegaAwesome App by Readify’
      • Controlled by Readify’s IT admins (not dev team)

The App Registration and the Enterprise App then represent the internal split between the dev team and the IT/security team who own the directory.

Consent

This is (mostly) how an enterprise app instance gets created.

The first time an app is used by a subscriber tenant, the enterprise app entry is created. Some form of consent is required before the app actually gets any permissions to that subscriber tenant though.

Depending on the permissions requested in the app registration or login flow, consent might come from the end user, or might require a tenant admin.

In an interactive / web-based login flow, the user will see the consent prompt after the sign-in screen, but before they’re redirected back to the app.

Our Problem

In the scenario that both of my co-workers were hitting, they had:

  1. Created an app registration
  2. Grabbed the app’s client ID and secret
  3. Tried to make an API call using those values
  4. Failed with a 401 Unauthorised response

Because they weren’t redirecting a user off to the login endpoint, there was no user-interactive login flow, and thus no opportunity for the enterprise app entry to be created or for consent to be provided.

Basic Solution

You can jump straight to the consent prompt via this super-easy to remember, magical URL:

https://login.microsoftonline.com/
{TenantDomain}
/oauth2/authorize
?client_id={AadClientId}
&response_type=code
&redirect_uri=https://readify.net
&nonce=doesntmatter
&resource={ResourceUri}
&prompt=admin_consent

You fill in the values for your tenant, app client ID, and requested resource ID, then just visit this URL in a browser once. The redirect URI and nonce don’t matter, as it’s only yourself being redirected there after the consent has been granted.

For example:

https://login.microsoftonline.com/
readify.onmicrosoft.com
/oauth2/authorize
?client_id=ab4682b39bc...
&response_type=code
&redirect_uri=https://readify.net
&nonce=doesntmatter
&resource=https://graph.microsoft.com
&prompt=admin_consent

Better Solution

Requiring a user to visit a one-time magical URL to set up an app is prone to failure. Somebody will inevitably want to deploy the app/script to another environment, or change a permission, and then wonder why everything is broken even though the app registrations are exactly the same.

In scripts that rely on app-based authentication, I like to include a self-test for each resource. This self-test does a basic read-only API call to assert that the required permissions are there, then provides a useful error if they aren’t, including a pre-built consent URL. Anybody running the script or reviewing logs can browse straight to the link without needing to understand the full details of what we’ve just been through earlier in this post.

Preferred Solutions

Where possible, act on behalf of a user rather than using the generic app secret to make API calls. This makes for an easier consent flow in most cases, and gives a better audit log of who’s done what rather than just which app did something.

Further, try to avoid actually storing the app client ID and secret anywhere. They become another magical set of credentials that aren’t attributed to any particular user, and that don’t get rotated with any real frequency. To bootstrap them into your app, rather than storing them in config, look at solutions like Managed Service Identity. This lets AAD manage the service principal and inject it into your app’s configuration context at runtime.

Other Resources

https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-application-objects

Hassle-Free Screencast Recording with Windows 10

Short screencasts are a key technique in my overall communication toolbox.

I like to use them to communicate complex messages, where the background is often even more important than the actual outcome.

They allow us to share complex ideas in a way that people can consume asynchronously. As an example, right now we’re rolling out our Readify FY18 plans as a series of 10-minute chapters.

They also prevent people from skim-reading and missing context. I design the length of the video to be the time investment required of somebody. (Here’s an idea. Want an opinion? You need to invest at least X minutes of your time to hear it out first. Otherwise, you’re wasting both of our time.) Typically, 5-10 minutes is a good-sized chunk. If somebody comments “I haven’t watched the video, but…”, then I’m quite comfortable suggesting they take the 5-10 minutes to actually watch the video, and I’ll be willing to discuss it when they’re done.

Of course, screencasts are just one communication technique. Earlier in the FY18 planning process, when there were ideas flying everywhere, we shared the content as mostly raw meeting notes and bullet point dumps. We shared with the expectation that people would skim-read at best, and that that was ok: they would call out the areas they wanted to talk about in any detail.

Previous Tools

PowerPoint’s own recording tools aren’t great. They’re designed for narrated slides, rather than a recorded presentation, and there are some subtle but awkward differences there. The prime one is that they store the audio track as a separate recording on each slide, resulting in a silent gap at each slide transition. I usually talk over the slide transitions to bridge the content, which ends up sounding terrible. It also trashes your PPTX file by putting auto-advance time stamps all through it (because that’s how they move to the next audio block).

I used to use a tool called SnagIt. It was nice and simple, with a reasonable price point (~$50), however it hasn’t progressed to the world of 4K screens. On a Surface Pro 4, this means you’re stuck with only ¼ of your screen fitting in the maximum viewport, or you have to stuff around with changing screen resolution every time you want to record.

Native Windows 10 Tools

With Windows 10, you can now produce a decent output just using built-in tools in the OS. Some of the features aren’t so obvious though, hence this post to share how I stitch them all together.

In most cases, we’ll only be recording a single app – like PowerPoint, or a web browser. These instructions assume that’s the case. If you want to toggle back and forth between multiple apps, then you’ll need to fall back to a third party app like SnagIt or Camtasia.

💭 Give these steps a go as you read through this post. You’ll just record the browser here, but you’ll get to see how it all hangs together.

Quiet Hours

The last thing you need is an awkward IM during your recording. Even if it’s completely innocuous, it’ll make you stumble mid-sentence. Turn on Quiet Hours to silence those notifications.

Quiet Hours setting

You’ll still need to exit non-Store apps, like Outlook and Skype for Business. Until they’re upgraded to the world of UWP, they just blissfully ignore this setting.

Game Bar

Windows 10 includes a built-in Game DVR as part of the Xbox integration. We’re going to trick that into being more broadly useful for us.

First, launch the app you want to record, such as PowerPoint.

Then hit ⌨ Win+G to launch the Xbox Game bar. (G is for Game.)

The first time you do this, poor old Windows is rightly going to be a little confused about you launching the Xbox controls over the top of PowerPoint:

Game bar prompt

Just tick that box to say “Yes, this is a game”, and you’ll get the full set of controls:

Game bar full

And now you know how I unwind of an evening. PowerPoint! Woo!

Start Recording

You’ll need to explicitly tick the “Record mic” box each time, as it’s normally optimised just to record the game audio, and not the user.

Then, just hit the big, red, record button. 🔴

The Game bar won’t be part of the recording, so don’t worry if it sticks around on the screen.

Sometimes, the “Record mic” box unchecks itself again after you start recording. Double check that it’s still checked (filled in) before you power into your recording.

Clear Audio

We’re not aiming for TV broadcast quality here, but we do want to get decent audio that’s not going to annoy listeners. With a few simple tips, you can usually achieve that without buying a special mic or stapling egg cartons to the wall.

Aim for a larger room, preferably with softer furnishings. In an office, this is probably a larger boardroom with the blinds down. At home, it’s probably your bedroom (seriously, the blankets and pillows are perfect!). A small meeting room isn’t a deal breaker, but it’ll definitely sound like you’re talking in a bucket.

Start off by clearing your throat, then saying something at a normal volume. “Hi, I’m talking to myself”. Pause for a few seconds, then start your actual content. This technique will cause the automatic volume levelling on your mic to kick in and sort itself out. We’ll trim this off later.

Sit central to the laptop that you’re recording on. Most modern laptops have stereo microphones in them, so if you talk at it side-on then your audience will only hear you side-on.

Keep the energy up. The normal pacing techniques that you’d use for a live presentation don’t apply here; they just leave people to get distracted. If it feels like you’re racing through your content and talking too fast, then it’s probably about right.

Stop Recording

Leave at least a few seconds of silence at the end. There’s no rush to stop the recording, because we’ll trim the end anyway.

If the Game bar is still on screen, you can just click the Stop button.

If it disappeared, press Win+G again to bring it back first.

Recording Location

You’ll find the recording under Videos, then Captures.

My Computer Videos

While it was recording, Windows was just focussed on capturing everything really quickly. It didn’t want to use up CPU cycles that might cause a hiccup to the game that we were playing. As a result, the file will be huge, because it hasn’t been optimised at all. We’ll fix that next.

Trim + Optimize

Right click the video, then Open With > Photos. Yes, really, the Photos app, not the Film & TV app. That’s because the Photos app includes a video trimming tool:

Photos Trim

Even the smallest amount of trimming will let you then save a new copy of the video:

Photos Save a Copy

This time, Windows will work harder to generate a much smaller file. For recordings of slides and apps, you’ll generally see the file size reduce by 95% or more, which makes it immensely easier to distribute.

The smaller file will appear in the Captures folder, next to the original.

Three Takes Tops

Screencasts should be easy and natural for you to produce, so don’t make them a big event.

I like to restrict myself to three takes max:

  1. The first one is just a test. I know I’ll throw it away. It’s useful to test the audio, mumble my way through the slides, and think about what I’m going to say.
  2. The second take might be real. I try for it to be real. If I say ‘um’ or ‘ah’, it’s ok – it’s natural speaking style – and I keep going. I definitely try not to say ‘oh crap, I totally stuffed it’ because that just completely trashes the recording and forces you to restart. If this take is good enough, then we’re done and dusted. More often than not though, I stuff this one up majorly by getting slides in the wrong order, or getting my message wrong.
  3. The third take must be real. Any more effort than this is too much effort.

This means that a 10-minute presentation should take ~30 mins to record. I usually book out an hour, so that I then have time left to upload the recording to Office 365 Video and post it out to Yammer.

No doubt, your first few attempts will take a bit longer while you’re learning both the tools and the style. That’s ok; just avoid getting stuck in infinite takes. Once you hit about five takes in a single sitting, it’s time to pack it up and take a break. Your voice will need a rest, and you’ll likely be muddled up about what points you have or haven’t said in any given take.

Disable Game Bar

While the game mode is enabled, even when you’re not recording, PowerPoint is running in a bit of an odd state that it’s not used to. Your whole PC will probably feel a bit sluggish.

To disable it again:

  1. Return to PowerPoint (or whatever app you were recording)
  2. Hit ⌨ Win+G to launch the Game bar again
  3. Click Settings
  4. Untick “Remember this as a game”

Then, your PC will be as good as it was to begin with.

Companion planting for the optimal garden, in Windows 10

We’re in the transition seasons in both hemispheres right now: autumn in the south, and spring in the north. This is a good time to establish a new crop of plants before the conditions get too harsh in the peak seasons.

In our house, we wanted to replace the under-loved front courtyard with a basic vegetable garden that will produce some winter greens. We’re only talking about a small urban space here, but it’s amazing how much you can produce from that, and just how much it improves the look of the space.

First, we built a simple raised bed: 1.8m × 1.4m, and around 20cm deep. Minimal tools were required, as the hardware store cut the wood to size for us, so we just had to screw some brackets into each corner and dig them in with a basic hand trowel. We covered the existing dirt with some soaked cardboard as a weed and nutrient barrier before loading in the new potting mix (80%) and manure (20%).

Photo of the finished raised garden bed

The next challenge was to work out what plants we wanted. We had an idea – leafy winter greens – however garden bed planning always runs into a challenge when you consider companions and enemies. Companion planting is especially important in shared beds, where plants can compete with each other, send each other out of balance, or strive for success together.

This process has always been quite manual and annoying. As soon as you start reading about one plant, you’ll quickly find that it’s not compatible with something else you had planned, and it’s back to rearranging everything again. My mother has slowly compiled the Excel-sheet-to-end-all-Excel-sheets, saving on browser-tab-fatigue, however it’s still a laborious process to apply to a brand new garden. (And that’s if you even know everything you want to plant in the first place!)

Of course, the solution was to pause here and build a simple Windows 10 app:

Screenshot of the bed planner app

Get it on Windows 10

As you drag-and-drop plants onto the bed planner, the app constantly recalculates what will and won’t be compatible.

The list of potential plants is automatically sorted to hint “Great Companions” and “Good Companions” first, and push those sneaky enemies to the bottom of the queue.

This also means that you can use it somewhat like Spotify Radio: just pick one plant you really want (say, basil), and drag it on to the bed planner. The list of potential plants will instantly suggest capsicum or tomatoes as the ideal plants to add next. Just keep adding from the top of the list and you’ll have a perfect garden in no time.

It also renders basic sizing and spacing information, so you can get an idea of what will actually fit in your space.

With the app built, our optimum winter garden is now well on its way to success. Hopefully yours can be too!

Photo of the newly planted garden bed

⚠ Action Required: Revisit your Skype account security

The Really Really Short Version

If you don’t have time to read the full post below, these are the minimum steps you should follow to secure your Skype account:

  1. Visit https://account.microsoft.com
  2. If already signed in, sign out
  3. Sign in with your Skype account (old Skype username, not email or phone number)
  4. Follow the bouncing ball to completion

For the most complete fix, and a little background, read on.

How Azure Active Directory (AAD) and VS Team Services (VSTS) relate to Azure Subscriptions

I recently received a question along the lines of “I’m a co-admin on an Azure subscription, so I should have access to everything, but I can’t modify the directory. Why not?”

Here was my answer:


Azure Subscriptions are a billing and security container.

Azure Subscriptions contain resources like blob containers, VMs, etc.

Azure Directory is an identity container.

Azure Directories can:

  • Define a user account (org id)
  • Reference a user account from another directory (foreign principal org id)
  • Reference a Microsoft Account

An Azure Directory is not a resource. It lives outside the subscription, on its own.

Every Azure Subscription is linked to an Azure Directory as the place that it reads its identities from.

“Co-administrator” is a security role applied at the Azure Subscription level, granting control of the resources within that subscription.

Generally, we should be granting less co-admin rights in the old portal, and focussing on RBAC-based grants in the new portal instead. (They’re more finely grained, to a specific resource, resource group, or set of actions.)

Because an Azure Directory is not a resource, and does not live in the subscription, the co-administrator concept does not apply to it.

“User administrator” and “Global administrator” are security roles applied at the Azure Directory level. They relate only to the directory, and not any linked subscriptions.

VSTS Accounts are another stand-alone entity.

A VSTS Account can be linked to an Azure Subscription, so that charges can flow to the Azure subscription. If it is not linked, then there’s no way to use any paid services (as VSTS doesn’t have its own commerce system).

A VSTS Account can be linked to an Azure Directory. This is essentially like “domain joining” a PC; it opts you into a number of management advantages. If it is not linked, then you can only use Microsoft Accounts for sign-in, and it essentially maintains its own lightweight identity store in lieu of the directory.

All Azure Subscriptions are part of an Azure Account. This is where the billing information is maintained.

All Azure Accounts have a Service Administrator and an Account Owner. These are security roles applied at the Account level. They do not grant any rights into the subscriptions, directories, or VSTS accounts (as they are all different, independent entities).


When you login to https://portal.azure.com, you login with an identity that’s in the context of a directory. You can see your current directory context top-right. You will see the different resources which are within subscriptions that are linked to your current directory. You may have no subscriptions at all, in which case you just see the directory but an otherwise empty portal.

When you login to https://manage.windowsazure.com, you must always be in the context of a subscription. (Old portal, old rules.) You will see all of the directories that you have access to as a user, regardless of which subscription context you’re in. If you have access to a directory but no subscription at all, it will boot you out of the portal with an error about having no subscriptions. To work around this, we grant everybody at Readify co-admin access to an “Authentication Helper” subscription. It’s empty, but it lets you login with your OrgId and then swap to the other directory that you were actually looking for. I really dislike the old portal.


Clear as mud? 🙂

Software still at the heart of IoT

Earlier today, I was quoted in Drew Turney‘s Tech giants get ready for Internet of Things operating systems article for The Age.

The article explores the relevance of ‘dedicated’ IoT systems, like GE’s Predix.

I’d like to expand on this quote:

“The opportunity of IoT lies in integrating physical intelligence right through to business processes, and back out again”

Much of the current discussion around IoT is focussed on cheap sensors, platform interoperability, and data analytics. These are all important building blocks, but they don’t really talk to the power of IoT for me.

We’ve spent two decades mashing up databases. Disconnected datasets now annoy even our least technical friends.

We spent the last decade mashing up web services. It’s mind-boggling that I can add a high-quality, interactive map with global coverage straight into an app, and then spend longer trying to generate all the different icon sizes required to publish it.

We’ll spend this decade mashing up the physical world. We’re nearing the point that it’s as easy to connect to your toothbrush as it is to connect to a web service.

Software remains at the heart of all this: it’s just that we can now reach further than ever before. Rather than waiting for input data, we can just go and get it. Rather than sending an alert, we can just go and start/stop/repair/move/etc. whatever we need to.

Separately, it was encouraging to see security raised several times. A device that’s too small and dumb to run the math required for encryption is probably not something to be exposed to the public internet.

And of course, it’s always nice to see Readify’s name alongside the likes of Intel, GE, and CSIRO. 🙂