Ben on BBO - Feedback thread Discussions about Ben models trained on BBO data
#62
Posted 2024-January-17, 13:57
http://tinyurl.com/yw279eju
#63
Posted 2024-January-17, 14:38
diana_eva, on 2024-January-17, 13:57, said:
http://tinyurl.com/yw279eju
If you were playing with GIB, you could argue that playing a high spade is correct, because South is guaranteed to hold the ♠Q for the 1♥ response; if that's your only spade left, as it is here, then playing low is fatal.
#64
Posted 2024-January-17, 14:42
smerriman, on 2024-January-17, 14:38, said:
If you were playing with GIB, you could argue that playing a high spade is correct, because South is guaranteed to hold the ♠Q for the 1♥ response; if that's your only spade left, as it is here, then playing low is fatal.
Ah, I didn't think of it that way. Yes, maybe it is right to play in that order if you want to avoid endplaying partner. I guess JAK was not obvious after all. Thanks smerriman.
#65
Posted 2024-January-17, 18:00
My double here was dubious regardless, but I was expecting a stronger hand from Ben based on the explanation shown for the 3C bid.
Is there any sort of check by Ben as to whether its chosen bid matches the description that will be displayed for it?
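Purely as a sketch of what such a check could look like (the deal generator, the description predicate and `ben_policy` below are all invented for illustration, not the real project's code): sample hands that satisfy the displayed description and measure how often the stated bid is actually the one the policy chooses.

```python
# Hypothetical sketch: test whether the bid Ben chooses is consistent with
# the GIB-derived description BBO will display for it. `ben_policy`, the
# description predicate, and the deal generator are all invented here.
import random

RANKS = "AKQJT98765432"
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

def random_hand():
    deck = [suit + rank for suit in "SHDC" for rank in RANKS]
    return random.sample(deck, 13)

def hcp(hand):
    return sum(HCP.get(card[1], 0) for card in hand)

def matches_description(hand):
    # Example displayed description: "11-12 HCP, 5+ clubs".
    clubs = sum(1 for card in hand if card[0] == "C")
    return 11 <= hcp(hand) <= 12 and clubs >= 5

def ben_policy(hand, auction):
    return "3C"  # stand-in for the neural network's choice

def consistency(auction, stated_bid, n=200):
    """Fraction of description-matching hands on which the policy
    actually makes the bid the description is attached to."""
    hits = trials = 0
    while trials < n:
        hand = random_hand()
        if matches_description(hand):
            trials += 1
            hits += ben_policy(hand, auction) == stated_bid
    return hits / n

# A low fraction would flag the displayed explanation as misleading.
print(consistency(["1S", "P"], "3C"))
```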
#66
Posted 2024-January-18, 00:35
Perhaps the art of bidding slams off two aces is too intricate for Ben to try.
#69
Posted 2024-January-18, 02:23
It is an interesting experiment, and there is likely plenty of room for improvement when it comes to scoring better when playing with itself, as per the other threads comparing its performance to that of other robots.
But when it comes to Ben being a viable robot partner for a human... I just can't see a future there.
When playing with a robot, you want a partner that is consistent and understandable. Not necessarily flawless, but one whose limitations are understandable, and with the potential for improvement (the biggest letdown by BBO with GIB).
Ben doesn't have the slightest clue what any bids mean. It pretends to, via the alerts that BBO copies from GIB, even though Ben never sees them. But it doesn't; it will pass forcing bids, give the wrong response to Blackwood (or, if you're the one responding, not use the information it receives at all), preempt and then raise its own preempt, massively overbid, massively underbid, and countless other errors - once you've experienced some of these, you end up with no confidence in any of its bids at all.
Learning from example also means it's considerably worse at handling psyching than GIB; after some poor results in the Ben & Friends daylong playing normally, I switched to psyching on every single hand, resulting in 1st (10% higher than second place), 2nd, and 3rd places - and the 2nd was when I had played normally for the first 2 boards and made a mistake for a 0% board before going back to psyching to recover to a 63% average.
Correct me if I'm wrong, but other than rare exceptions, like raising its own doubled contract hoping it won't be doubled again, the way it's coded appears to mean it's not possible for a developer to fix the meanings of any bids, because none of the meanings of any bids are part of the program.
To try to get it to better understand a bid, sure, you could give it a large sample of hands for the situation in question along with the correct bid. But generating a sufficient set of hands is equivalent to programming the proper definition of the bid in the first place; if you can do that in enough cases that actually matter, you probably have a better robot than GIB without needing the AI in the first place. Starting to hardcode conventions like Blackwood into Ben feels like a deep rabbit hole.
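To make that circularity concrete, here's a rough sketch (everything below is invented for illustration) of generating labeled examples for Blackwood responses. Notice that the labeling function *is* the definition of the convention - exactly the thing we were hoping the network would learn for itself:

```python
# Hypothetical sketch: generating training data for Blackwood responses.
import random

RANKS = "AKQJT98765432"

def random_hand():
    deck = [suit + rank for suit in "SHDC" for rank in RANKS]
    return random.sample(deck, 13)

def blackwood_label(hand):
    # The "correct" reply to 4NT: 0 or 4 aces -> 5C, 1 -> 5D, 2 -> 5H, 3 -> 5S.
    # This rule is precisely the convention we wanted Ben to learn from data,
    # so writing this function is programming the definition of the bid.
    aces = sum(1 for card in hand if card[1] == "A")
    return ["5C", "5D", "5H", "5S", "5C"][aces]

training_set = [(hand, blackwood_label(hand))
                for hand in (random_hand() for _ in range(10_000))]
```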
I can see potential uses of AI for improving existing robots - it would be an interesting project to take a large set of hands played with GIB and use them to determine where GIB's most common flaws are.
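As a sketch of what that mining could look like (the log format here is invented): score each logged deal against double-dummy par, then rank auction patterns by average loss so the worst-leaking areas float to the top:

```python
# Hypothetical sketch: ranking auction patterns by GIB's average loss vs par.
from collections import defaultdict

# Invented log format: (auction, GIB's actual score, double-dummy par score).
records = [
    ("1N-3N", -100, 400),
    ("1N-3N", 400, 400),
    ("1S-2S-X-P-P-P", -500, -100),
]

loss_by_auction = defaultdict(list)
for auction, score, par in records:
    loss_by_auction[auction].append(par - score)

# Auctions sorted by mean points lost to par; the top entries would be
# the most promising places to look for systematic GIB flaws.
for auction, losses in sorted(loss_by_auction.items(),
                              key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(auction, sum(losses) / len(losses))
```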
But there are so many areas of GIB where an achievable amount of work put in would result in a massive increase in enjoyment - I've posted about several of them in depth over the years. Ben just isn't *fun* to play with, or even to post bug reports about, and the way it's coded I just can't see that changing.
#70
Posted 2024-January-18, 03:56
When GIB is on firm ground, as it generally is in the first round of bidding and maybe also when giving a signal on partner's opening lead, GIB doesn't need help from a neural network, and the neural network will probably be worse than GIB, especially as a partner for a human.
On the other hand, in most cardplay situations, especially when declaring, and also in convoluted auctions where GIB defaults to stupid rules like "double is t/o but basically just shows 13 cards", "if nothing fits, just pass" or "4♣ shows the values to make 4♣, i.e. 24+ total points", the neural network could potentially do better.
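A minimal sketch of that division of labour (nothing here is GIB's or Ben's real interface; the rule table, the auction encoding and `neural_bid` are stand-ins): sequences with exact meanings get hard-coded answers, and only the ill-defined ones fall through to the network.

```python
# Hypothetical sketch of a rule/network hybrid bidder. The rule table,
# auction encoding and `neural_bid` are stand-ins, not real interfaces.

# Sequences with exact, non-negotiable meanings stay hard-coded.
RULES = {
    ("1N", "P", "2C", "P"): "2D",   # e.g. Stayman reply with no 4-card major
}

def count_aces(hand):
    return sum(1 for card in hand if card[1] == "A")

def neural_bid(auction, hand):
    return "P"  # stand-in for the trained policy network

def hybrid_bid(auction, hand):
    key = tuple(auction)
    if key in RULES:
        return RULES[key]
    if len(auction) >= 2 and auction[-2] == "4N":
        # Oversimplified: treat partner's 4N as Blackwood and compute the
        # exact conventional reply instead of letting the network guess.
        return ["5C", "5D", "5H", "5S", "5C"][count_aces(hand)]
    # Convoluted, ill-defined auctions fall through to the network.
    return neural_bid(auction, hand)

hand = ["SA", "HA", "DK"] + ["C" + r for r in "AKQJT98765"]
print(hybrid_bid(["1S", "P", "3S", "P", "4N", "P"], hand))  # -> "5S"
```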
#71
Posted 2024-January-18, 04:47
Or maybe just treats it with contempt.
http://tinyurl.com/yt7w289q
#72
Posted 2024-January-18, 07:59
fuzzyquack, on 2024-January-18, 02:15, said:
good point about discarding hearts!
i'm not following what you mean with Ben S. (also not claiming that Ben would have discarded hearts if it was sitting south)
#73
Posted 2024-January-18, 08:02
smerriman, on 2024-January-18, 02:23, said:
It is an interesting experiment, and there is likely plenty of room for improvement when it comes to scoring better when playing with itself, as per the other threads comparing its performance to that of other robots.
But when it comes to Ben being a viable robot partner for a human... I just can't see a future there.
When playing with a robot, you want a partner that is consistent and understandable. Not necessarily flawless, but one whose limitations are understandable, and with the potential for improvement (the biggest letdown by BBO with GIB).
Ben doesn't have the slightest clue what any bids mean. It pretends to, via the alerts that BBO copies from GIB, even though Ben never sees them. But it doesn't; it will pass forcing bids, give the wrong response to Blackwood (or, if you're the one responding, not use the information it receives at all), preempt and then raise its own preempt, massively overbid, massively underbid, and countless other errors - once you've experienced some of these, you end up with no confidence in any of its bids at all.
Learning from example also means it's considerably worse at handling psyching than GIB; after some poor results in the Ben & Friends daylong playing normally, I switched to psyching on every single hand, resulting in 1st (10% higher than second place), 2nd, and 3rd places - and the 2nd was when I had played normally for the first 2 boards and made a mistake for a 0% board before going back to psyching to recover to a 63% average.
Correct me if I'm wrong, but other than rare exceptions, like raising its own doubled contract hoping it won't be doubled again, the way it's coded appears to mean it's not possible for a developer to fix the meanings of any bids, because none of the meanings of any bids are part of the program.
To try to get it to better understand a bid, sure, you could give it a large sample of hands for the situation in question along with the correct bid. But generating a sufficient set of hands is equivalent to programming the proper definition of the bid in the first place; if you can do that in enough cases that actually matter, you probably have a better robot than GIB without needing the AI in the first place. Starting to hardcode conventions like Blackwood into Ben feels like a deep rabbit hole.
I can see potential uses of AI for improving existing robots - it would be an interesting project to take a large set of hands played with GIB and use them to determine where GIB's most common flaws are.
But there are so many areas of GIB where an achievable amount of work put in would result in a massive increase in enjoyment - I've posted about several of them in depth over the years. Ben just isn't *fun* to play with, or even to post bug reports about, and the way it's coded I just can't see that changing.
this feedback is a bit deflating.
but i appreciate reading it because the thoughts and arguments are good.
thanks for taking the time to play with it, and thanks for giving detailed and honest feedback.
#74
Posted 2024-January-18, 11:44
lorserker, on 2024-January-18, 07:59, said:
i'm not following what you mean with Ben S. (also not claiming that Ben would have discarded hearts if it was sitting south)
Sorry, my bad. It was diana_eva who could discard ♥s to help Ben N. Can't expect Ben to cover up for humans ;-})
#76
Posted 2024-January-18, 15:32
smerriman, on 2024-January-18, 02:23, said:
When playing with a robot, you want a partner that is consistent and understandable. Not necessarily flawless, but one whose limitations are understandable, and with the potential for improvement
If we imagine bridge robots in the framework of an actual game where you choose the robot as a partner vs. a pair of humans or two other robots, this is a reasonable complaint.
OTOH robots on BBO also function as the digital equivalent of a pinball machine: each one with its own quirks requiring a different approach to maximise the result.
Thought about this way, Ben is no different.
It's not as if good results can't be achieved with unusual bidding against GIB/Argine etc.
Ben is another fun challenge; the more the better.
Ultimately, nobody will ever develop a true bridge-playing robot until it can:
1. Know when it isn't sure what an opponent's bid means.
2. Ask the opponent what it means and alter its behaviour in response.
3. Understand slightly sly replies.
4. Buy a round at the pub afterwards.
etc etc.
#77
Posted 2024-January-18, 18:21
Would love to hear whether you believe they are fixable, and the methods you had in mind for how that would be done.
Agree with Helene above that a hybrid option with an existing robot does feel like it could have a real impact, though. It's not necessarily first round vs. later rounds, but more that some bids have exact meanings and some have fuzzy meanings; the former aren't negotiable and must be concretely programmed, while the latter is where existing robots can fall down.
#78
Posted 2024-January-18, 21:32
http://tinyurl.com/5auwkv5x
Opposite GIB, I feel like doubles are very rarely left in. It was good to see Ben take the money on this hand. And clearly the bidding was erratic at all the tables. There were 34 different results!
http://tinyurl.com/544r58zb
#80
Posted 2024-January-19, 07:27
smerriman, on 2024-January-18, 18:21, said:
Would love to hear whether you believe they are fixable, and the methods you had in mind for how that would be done.
I think a lot of those problems will be fixed. When we started with machine translation and large language models, it was also difficult to see how they could ever be useful, but here we are.
Some thoughts about how I expect Ben2025 to be better:
Statistically stable mimic of the deterministic areas of GIB's bidding
After training on a billion hands, Ben will have learned from millions of Blackwood sequences, so it won't be necessary to code the answers to Blackwood.
Statistically stable mimic of GIB's sims
Ben can behave like a GIB that did thousands of sims for each decision.
Evaluate sims based on what worked in the training set, not what works DD
Ben may become more aggressive than GIB, because opps don't always make the right decision as to whether to take the push and whether to double, plus there's declarer's advantage (a toy sketch of this kind of outcome-based scoring appears at the end of this post).
Use data where a human partnered GIB
This is maybe more limited because you can't simulate billions of such hands, and many humans are too stupid to learn from. But when making bidding decisions, Ben could maybe learn a bit from successful human actions partnering GIB, and from successful GIB actions partnering a human.
Ultimately, Ben should improve GIB's bidding system also, in particular by filling in the gaps such as ill-defined high-level doubles. This will be a lot more challenging, but I am sure lorserker will come up with something.
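Here is the promised toy sketch of the "what worked in the training set" scoring (the log format and numbers are invented): estimate each candidate action's expected IMPs from the realized results of logged deals in a similar auction state, rather than from double-dummy par.

```python
# Hypothetical sketch: score candidate actions by realized outcomes in
# logged play rather than by double-dummy analysis. All data is invented.
from collections import defaultdict

# Invented log format: (auction-state key, action taken, realized IMPs).
log = [
    ("comp_over_3H", "X", 5), ("comp_over_3H", "X", -8),
    ("comp_over_3H", "3S", 6), ("comp_over_3H", "3S", 3),
    ("comp_over_3H", "P", -2),
]

totals = defaultdict(lambda: [0.0, 0])   # action -> [sum of IMPs, count]
for state, action, imps in log:
    if state == "comp_over_3H":
        totals[action][0] += imps
        totals[action][1] += 1

# Expected IMPs per action; this bakes in opponents' real mistakes and
# declarer's advantage, which double-dummy par can never reflect.
for action, (total, count) in totals.items():
    print(action, total / count)
```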