Last week we introduced the Calibur pocket box. It is once again a major leap in product design and features. Testing it is the last step before actual production.
The plan is something like this: 1. Build a very reliable, stable and accurate device. 2. Build all the smart features of the application on that basis.
We ran a broad épée test back in November which served as a way to pinpoint the weaknesses, analyze the data and incorporate it into a new device. It’s worth summarizing the major takeaways. Today we’ll cover technical issues and how we addressed them:
Let’s jump straight to connectivity. Connectivity is the bread and butter of a wireless system: it can only be as good as the quality of its connection. That quality is judged by 2 factors: stability and latency. Put more simply, how often the connection drops and how much time it takes for a hit to display. The boxes we shipped in November had inconsistencies on both fronts, so we put a lot of effort into fixing them.
A simple way to measure stability is to ask: how far can you get without disconnecting? Mark a 15-meter strip every 5 meters (with the scoring device at the middle of the strip, neither fencer would ever get farther away than 7 m, and in most cases much less), put the devices at each mark and note the results. Then repeat with the device in the back pocket, then again while covering it, then with walls in between, then underwater in a tin suit… on each and every phone.
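A test run like this can be logged with a tiny script. This is just an illustrative sketch, not our actual tooling; the distance marks and results are made up:

```python
# Hypothetical sketch of logging the range test described above: for
# each phone and each mark on the strip, record whether the connection
# survived, then find where it first dropped.

def dropout_distance(results):
    """First distance (in meters) at which the connection dropped,
    or None if it never dropped on the strip."""
    for distance, connected in sorted(results.items()):
        if not connected:
            return distance
    return None

# One test run: distance mark (m) -> connection still alive?
run = {5: True, 10: True, 15: False}
print(dropout_distance(run))  # first drop at the 15 m mark
```

Repeating this per phone and per condition (pocket, covered, walls) gives a simple table of dropout distances to compare between firmware versions.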
In general, Apple devices disconnected much sooner than Android ones, but we were able to improve on both systems. The improvement here is plain to see: we drove drops down to practically 0. Tested on a dozen different phones and tablets, from low-end to high-end, the connection always remained active. The challenge now is to find an otherwise well-functioning phone that fails this test, and I’m happy to report that we haven’t found one.
Latency is a more complex issue. How long is actually good enough? We found the threshold for unnoticeable delay to be around 100 ms.
Latency consists of 2 parts: how long it takes to send the data, and how long it takes the phone to process it. Let’s check the first part, so back to the marked strip. Attach 2 synchronized clocks, one to the pocket box and one to the phone, and repeat the stability test, but this time send the same data packet at each mark.
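With the two synchronized clocks, each packet yields a send timestamp on the box and a receive timestamp on the phone. A minimal sketch of summarizing those pairs against the ~100 ms threshold (all numbers and function names here are illustrative assumptions, not our actual pipeline):

```python
# Sketch: estimating transmission latency from paired timestamps taken
# with synchronized clocks on the pocket box and the phone.

from statistics import mean, quantiles

def latencies_ms(sent_ms, received_ms):
    """One-way latency per packet, in milliseconds."""
    return [rx - tx for tx, rx in zip(sent_ms, received_ms)]

def summarize(lat, threshold_ms=100):
    """Mean, 95th percentile, and share of packets under the threshold."""
    p95 = quantiles(lat, n=20)[-1]  # last cut point = 95th percentile
    under = sum(1 for v in lat if v <= threshold_ms) / len(lat)
    return {"mean_ms": mean(lat), "p95_ms": p95, "under_threshold": under}

# Example: five packets recorded at one distance mark
sent = [0, 50, 100, 150, 200]
received = [42, 95, 138, 201, 244]
print(summarize(latencies_ms(sent, received)))
```

Running this per distance mark shows whether latency stays comfortably under the perception threshold as range grows.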
The results are again very promising. The delay dropped significantly and stayed low up until around 15 meters. So now it is up to the phone to process it.
Getting the phone's processing time down proved harder to optimise than anticipated. But it's clear that the delay is now caused mainly by the application, which will perform much better after restructuring. More on that later.
Accuracy depends on how well the data pool is utilized, which creates a sort of chicken-and-egg problem. For people to use the devices, they need to be enjoyable; to be enjoyable, people need to use them. So the deal with the broad épée test was that people would use the devices even while they weren’t so enjoyable, we would quickly follow up with updates, and by the end of the test they would become a quasi-replacement for any épée system.
After a very strong start, restrictions started to kick in basically everywhere in December, and the flow of incoming data slowed down. But we still managed to get thousands of bouts and tens of thousands of touches worth of data. Huge thanks to our testers for keeping up despite all the hurdles! Two things became clear: 1. The data would not be big enough under the current model (it’s not just a matter of total volume but of constant flow, so we can compare after every version). 2. We certainly underestimated how much work it would be to maintain one system and develop a new one in parallel. So we moved forward and put the focus on incorporating the existing data into the next version.
The goal was to make accuracy high enough that fencers would enjoy using the devices in training sessions, so the rest of the data would come much more easily. The 2 major questions in accuracy are: does it go off when it shouldn’t, and do we need calibration to avoid that? We were able to improve tremendously on both fronts.
How-it-works grey box: This is actually the most complicated part. Touches are validated by a model that depends on the electrical properties of the weapon. Put simply, we measure these properties and gather them in the cloud, where machine learning algorithms process them. The validation model is then updated based on the new data and fed back to the devices through the application. Where the previous iteration measured one property, we now measure 3, and the machine learning model processes them together.
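As a rough illustration of that grey box (this is not Calibur's actual model; the feature names, weights and logistic form are all invented for the example): think of the 3 measured properties being combined by parameters that the cloud retrains and ships down with app updates, while the on-device logic stays the same.

```python
# Toy validation model: three measured electrical properties are
# combined into one score; the parameters come down from the cloud
# with each model update. All names and numbers are made up.

import math

def hit_score(resistance, capacitance, inductance, weights, bias):
    """Logistic score in [0, 1]: how likely the touch is valid."""
    z = (weights[0] * resistance
         + weights[1] * capacitance
         + weights[2] * inductance
         + bias)
    return 1.0 / (1.0 + math.exp(-z))

def is_valid(features, model, threshold=0.5):
    """Apply the current model shipped via the app."""
    return hit_score(*features, model["weights"], model["bias"]) >= threshold

# A model update replaces the parameters, not the firmware logic:
model = {"weights": [-2.0, 1.5, 0.5], "bias": 0.2}
print(is_valid((0.1, 0.8, 0.3), model))
```

The design point this sketch shows: because only the parameters change, improving accuracy becomes a data problem in the cloud rather than a firmware problem on the box.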
TLDR: The general direction is very clear: latency is down, accuracy is up. We are getting through the 3rd wave (so far) of the pandemic, and with local clubs shut down it’s trickier to find ways to test sufficiently. In our testing the system works well in 90-95% of cases, so the job for the AI is to close the remaining gap, particularly on rare cases. Global testing will start next month with the goal of finding the outlier cases. As soon as the data feed starts to improve again, so will the system.
In the next posts we will cover the general feedback and future features, but to close, watch this video of a short bout in the office.
Product evolution #2
We nicknamed the project “wire eater” among ourselves, so I will refer to the devices as WE versions. The plan was set in motion: ship WE-1, incorporate feedback and develop WE-2 within 3 months. The goals for WE-2: make the app cross-platform (Android and iOS), get bellguard-grounding accuracy to 90-95% on épée, and deliver over 100 devices for the clubs to test. In other words, a broad beta test for WE-2. We planned the testing for November.
The sprint started in mid-August, and the first clubs buying WE-1 planned to restart fencing in September. We agreed to deliver for the reopening. For the first few weeks, development and production ran simultaneously, and delivering a product to actual customers was very exciting. We finished just in time:
Some pictures from a training session @ PSE. The kids had fun, and we gathered valuable insights.
The kids intuitively started to use our products and really enjoyed themselves
In the meantime we made a detailed roadmap for WE-2 with the 3 goals above in mind. Let’s go through them one by one.
Making the app cross-platform
It shouldn’t take particularly precise planning. We take what we have for Android and replicate it for iPhones, right? Well, Apple strictly controls everything, and why wouldn’t that be true for wireless devices too? If we want to connect something to iPhones, we need to use a wireless chip approved by Apple. If you had to guess whether we had used such a chip, where would you put your money, and why on “not”? WE-1 only supports épée and has no grounding capacity, but it has a very stable and fast connection with phones; that part was fine-tuned already. Changing the chip means throwing all that away and restarting.
Getting accuracy over 95%
Bellguard grounding should work 9+ times out of 10 in a test environment. Our model needs a larger data pool to operate. How would we get that? Obviously it’s not an option to gather it at the clubs every time we change something. So a lot of nights went like this:
999 green bottles on the wall, if one green bottle should accidentally fall, 998 green bottles on the wall…
Both goals turned out to be an uphill battle, especially in the given timeframe. All the new wireless chips turned out to be either too slow or to drop the connection with the phones too easily. Every time we tested something in the lab and took it to a real training session, it completely failed. Special thanks to OSC and their fencers for letting us do our it-worked-yesterday-I-swear show twice a week. The connection drop proved to be a very stubborn issue, basically disrupting every other test we tried to run.
OSC helped a lot, sharing their space to test
Producing 100 pieces
We were approaching the end of October. The clubs that signed up for testing were all set, waiting for their devices; we had just finished the new enclosure design, and for the first time the application was approved on both platforms.
It's much more convenient to connect the yellow penguin than a QR code
But the connection issues stayed with us. We had only a week left to finish, so we put our bets on a last-resort solution: changing the antenna design. We urgently needed a plan B, and getting the devices out of the pockets made the connection slightly better.
Desperate times call for desperate measures
I shouldn’t get into the details of the things we tried that last weekend in a rush to find a way to attach the devices to the body. The new antenna design, though far from perfect, proved to be above the threshold, and we decided to give it a go.
We made everything ready, and after an intensely laborious week with 3 hours of sleep a night on average, we tested, packed and labelled all the packages, ready for shipping. And the épée beta started.
Final touches before sending out beta test packages. (Nov.11.)
As I mentioned, I think it’s worth taking a brief look back at the evolution, at how we got to where we are today. Picking up the story from where we left off: our team had just formed. With a skillset more or less diverse enough to build hardware, we had a lot of questions. What should we focus on? What are the milestones to set? Who are the customers?
We decided to enter a startup competition and present the concept to a jury. It would provide 2 things for the project: firstly, an external commitment that forced us to build and deliver something on time; secondly, some guiding feedback. So we did exactly that: we built “something”.
Meet the “thing”, our very first prototype. Banana for scale
I shouldn’t get into nuances, like the main board being cardboard or the finely crafted details of the blue duct tape keeping the whole thing together. However, it did prove a few things. It was able to send a signal to a phone whenever the tip was pushed; in other words, we’d built a scoring machine in very basic terms. And it was enough to push us into the second round of the competition. More importantly, what we learnt from this experience is that you’ll always have additional features or fixes you want to finish before you “launch”, but actually putting something out there is what really starts the momentum.
So, preparing for the second round, we needed to “commercialize” the thing: shrink it to fit in a box and make it basically usable for some sort of fencing. We actually designed 2 models for this round. The better one was much smaller, had a rechargeable battery and far better chips on it; however, it was so unstable that we used the more basic but stable one. (Note that by “stable” I mean we needed to connect a headset to the phone because the device sometimes scored randomly. The plan was to disconnect the headset while presenting, pray it wouldn’t make any random sound, then reconnect the headset when we were done. Please don’t tell the jury.)
Somewhere in the distance an industrial designer starts to cry
Our chances were weaker than the binder clip holding the box together, but it worked! Or at least it did not fail during the presentation. The performance earned us third place in the competition, which gave us a great push. More importantly, we had something we could start to test on actual fencers.
Real life testing
So the next months were spent on incremental improvements, shrinking the size, going to fencing clubs and collecting feedback.
We visited 2 clubs a week, sometimes even more. Fortunately everyone was very welcoming, and the general feedback was very enthusiastic. We got proof that the old way of scoring was really the pain we understood it to be. Our two main goals were set: shrink the device to the size of 2 matchboxes, and lay the groundwork for data gathering. We started to test basic AI models on the data and went through a handful of iterations, each not very different from the last.
I'm glad we left the small coffin design on the left
Even though the app was Android-only, supported only épée and had no grounding feature, we had our first customer: a small club near Budapest decided to order 10 pieces. Sadly we didn’t have much time to celebrate, as it coincided with the worldwide spread of Covid-19. With the clubs closed and most indoor activities restricted, we didn’t see any option other than to suspend development.
A few months went by, the restrictions eased, and the world started to adapt to the new reality. We got more and more messages saying that wireless sets could be really helpful for maintaining health standards. So we decided it was time to get back on the project.
First we put together a comprehensive plan: in 3 months we wanted to achieve 3 key goals: make the app available on Android and iOS, get bellguard grounding over 95% accuracy on épée, and produce and ship over 100 devices to clubs. We will cover that in the next post.
How to make wireless scoring for fencers?
We set out to change fencing by making it truly wireless.
The current systems are cumbersome and have a lot of parts - even worse, moving parts - meaning there are a lot of things likely to fail. Yet they play a very important part in fencing.
Presumably, there are very few people who don't have some hard feelings toward reels after repairing one
However, if someone wants portability, it’s not possible without a serious trade-off between price and accuracy. But why is such a compromise necessary? What are the factors limiting us from having both? Let’s take them one by one:
- Make it accurate
The first major challenge in wireless scoring is the lack of a common ground. When 2 fencers are wired in, they basically form big electric circuits. Put simply, the fencers act like switches: when there’s a valid hit, the fencer closes a circuit and the scoring board makes a signal; when there’s an invalid hit (e.g. on the bellguard), a different circuit is closed and the scoreboard remains silent. However, if you take away the wires, there is no circuit to close, so it gets challenging to obtain that information. This is more widely known in fencing as the common ground problem.
Fortunately there are a few ways to get around this issue. Without getting deep into technical details: all electric systems have quantifiable characteristics. We just need to find a metric that differs enough when a valid touch occurs, then measure it the right way. If we can get it precise enough, we can tell valid and invalid hits apart, right? This immediately leads to the next problem: the vast variety of equipment. It’s not enough to separate valid and invalid hits; we must do it consistently across all the different fencing equipment. Weapons, lamés, masks, bellguards, conductive strips, regardless of age, maintenance, material composition and so on. It’s like needing to design a bullet that fits perfectly in every possible gun and rifle.
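To see why equipment variety bites, here is a toy illustration with one measured metric and a fixed threshold. All the numbers are invented, and real equipment varies far more, which is exactly the problem:

```python
# Toy version of the idea above: one measured electrical metric, one
# fixed threshold separating valid from invalid hits.

def classify(metric, threshold):
    """Valid if the measured metric exceeds the threshold."""
    return metric > threshold

# On one weapon the classes separate cleanly...
valid_hits   = [0.82, 0.91, 0.77]   # tip on target
invalid_hits = [0.21, 0.35, 0.18]   # tip on bellguard
assert all(classify(m, 0.5) for m in valid_hits)
assert not any(classify(m, 0.5) for m in invalid_hits)

# ...but a different, older weapon shifts the readings, and the same
# fixed threshold starts missing valid touches:
old_weapon_valid = [0.48, 0.55, 0.42]
print(sum(classify(m, 0.5) for m in old_weapon_valid), "of 3 detected")
```

A single hand-tuned threshold works for the equipment you tested; it is the shift between weapons, lamés and strips that forces something adaptive.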
How could we tackle that?
There are 2 options for developing such a system. The traditional way is to develop everything in-house, trying to replicate all real-life scenarios. First, find a method to overcome the common ground problem. Second, test and fine-tune it to an acceptable margin of error. Third, turn it into a commercial device and hope every possible setup was tested before the product release. It’s not hard to see the problem with this method: it gets increasingly harder to approach full accuracy. In reality you are likely to land in the neighbourhood of 90-95%, and then it’s nearly impossible to get to 99.9% accuracy within reasonable time and cost.
So what if we could design a system that adapts to the environment? A scoring method that gets more and more accurate with usage? To achieve this we need to develop a machine learning method and feed a big pool of touch data into it. So we need to store and access the data, analyse it, and finally use it to perfect the validation model. Once it’s up and running, the development cycle never stops: with each update, the product is better in the morning than it was the night before.
- Make it affordable
For lowering costs there are 2 obvious ways to go: make development cheaper, and make production cheaper. Fortunately, development gets much faster once the machine learning model is running, thus driving its cost down.
How can we make production cheaper?
Well, lowering the number of parts needed would come in handy, wouldn’t it? We already got rid of the reels and their wires, and we still need the body wires and the weapons - at least for the time being - so all fingers point to the scoreboard. Depending on the brand and the functionality, it can cost between $200 and $1,000 for a decent training setup, so we also need to get rid of it.
- Make it trusted
There’s a third, hidden factor that should be taken care of, a key advantage the old systems have: history. Since they have been around for nearly 100 years, there’s a level of trust that a wired system will work as it should. A wireless scoring system adds a layer of uncertainty, which undermines faith in it. Why? Because it’s hard to know for sure whether the fencer failed or the wireless did, and people tend to blame the latter. To put it another way, it’s not enough to be accurate; it’s also crucial to appear that way. How can we overcome this hurdle? It turns out the most frequent sources of equipment failure are the tips and the body wires. We need to make sure those are in suitable condition, and we need to notify the fencer of the current condition of her equipment. In short: we need to make it smart.
Neat, isn't it?
TLDR: we need a device that can store and send data wirelessly, is smart, and that everyone already has. I guess you’ve already figured it out: all these factors are perfectly aligned in smartphones and tablets. And this is just the surface of it; utilising smartphones does not stop at monitoring the condition of the equipment. Not using all that data would be such a waste of opportunity, and building useful features around it would make fencing so much easier for clubs and fencers, but that’s for a future post.