December 1, 2022

Can we all collectively band together and somehow tame a wild beast?

The wild beast I’m referring to is Artificial Intelligence (AI).

Now, let’s be abundantly clear that you are on thin ice if you claim that today’s AI is a beast. I say this because we usually reserve the word “beast” for a living breathing creature. Please know that the AI we have in our world today is not sentient. Not even close. Furthermore, we don’t know if sentient AI is possible. No one can say whether sentient AI will be achieved, such as via the oft worried about act of singularity, and any predictions of when it will occur are tenuous at best. For more on this question of sentient AI, singularity, and similarly outsized notions about AI, see my coverage at this link here.

Since I’ve clarified that AI is presumably mischaracterized as a beast, why have I gone ahead and asked the pivotal question about taming it in that stated terminology?

For several cogent reasons.

First, we might someday have sentient AI and in that case, I guess the beast title might be suitable, depending upon what you define as sentient AI.

Some suggest that a sentient AI would be a machine that can perform as humans can, but it is nonetheless a non-human. Is that kind of AI an animal? Well, maybe yes, or maybe no. It is a type of being that has the intelligence of humans, appears to be living, and yet is not a human, so the closest label that we have to assign is that of an animal. We can then call it a beast if we wish to do so. On the other hand, if it is entirely a machine, the animal moniker does not seem apt and ergo the beast title seems inappropriate. We might need to give AI a new category of its own and correspondingly ascertain whether a beastly naming is suitable.

This was the hardest consideration about the beast assignment, so let’s move on.

Secondly, some believe we will not only arrive at sentient AI, but they also strenuously assert that the AI might go off the charts and give rise to superintelligence. The idea is that the sentient AI will be more than the equivalent of human capacities. AI is seen as potentially eclipsing human intelligence and soaring into a superintelligence sphere. Once again, this is highly speculative. We don’t know that AI could get into that stratospheric realm. There is also the question of how super-intelligent can superintelligence be? Is there a cutoff at which superintelligence tops out? Also, what will it take to prove to us that AI is super intelligent versus just everyday normal humanly intelligent?

Third, you can somewhat get away with calling today’s non-sentient AI a beast, if you are comfortable ascribing an anthropomorphic aura to contemporary AI. As you’ll see in a moment, I am not a fan of the anthropomorphic allusions used when describing AI. Headlines that do so are easily misunderstood and lead society toward believing we do already have in our pretty little hands a sentient AI.

That’s not good.

I suppose another basis for saying that even non-sentient AI is a bit of a beast entails a different connotation or meaning associated with beasts per se. Rather than necessarily assuming that all beasts must be living creatures, we do admittedly at times refer to a monstrous-looking truck or car as a big beast. The same can be applied to massive-sized yachts, enormous airplanes, and gigantic rocket ships. In that sense, we already appear willing to contend that a thing can be a beast.

Let’s briefly take a quick side tangent about the beast title being assigned to AI.

Some are worried that we might eventually have sentient AI or super-intelligent AI that is all-powerful. There is a famous or shall we say infamous thought experiment known as Roko’s basilisk that postulates an all-powerful AI might come after everyone who, before the AI emerged, was downbeat or insulting toward AI (see my explanation about this at the link here). My point is that for those who have said AI is a beast, would this, later on, provoke a global-ruling AI to be copiously irked and summarily decide that the beast-naming humans will be the first to go? In which case, allow me to say right now that I am not saying AI is a beast in any pejorative sense. I sincerely hope that gets me off the hook.

Back to the beastly title. We tend to invoke dastardly oriented imagery when usually calling someone or something a beast. It doesn’t have to be used in that manner but often is. A lion that mauls a cute-looking antelope is nearly immediately called out as a beast. Beasts are untamed. They act in scary and impulsive ways. Most of all, we ordinarily don’t like how beasts sometimes treat humans.

Humankind has obviously sought to tame many beasts. The act of taming a beast means that we are seeking to reduce the natural instincts of attacking or harming humans (and possibly other animals too). Generally, a tamed beast is able to tolerate the presence of humans. Such a beast will not necessarily lunge at humans, though this can still happen if provoked or otherwise the taming strictness is overcome. In case you are wondering whether taming is the same as domestication, the encyclopedia answer is that those are related but differing concepts. Domestication has generally to do with the aspect of breeding a lineage to have an inherited predisposition friendlier toward humankind.

Okay, having dragged you through the beast naming conundrum, we can tie this to an ongoing concern and looming question that is being vociferously asked by AI ethics and considered part of the trend toward Ethical AI, which I’ve been covering extensively in my columns such as the link here and the link here, just to name a few.

The million-dollar question is this: Will we be able to control AI?

This is variously known as the AI control problem.

Some prefer to phrase this altogether crucial mega-topic as the AI containment problem. For those that are heavily versed in AI, they tend to drop the AI part of the techie discourse and shorten the vexing matter to simply the control problem or the containment problem. Other wordings are also used from time to time.

The rub is that AI might end up doing things that we don’t like. For example, wiping out all of humanity. The idea here is that we craft AI or it springs forth and decides humans aren’t all that we think they are. You’ve seen plenty of sci-fi movies with this sordid plot. AI at first is compatible with humans. Soon, AI gets upset with humans. This could be because we hold the key to AI functioning and are imperiling AI by threatening to unplug it. Or the AI might simply decide that humans aren’t worth the trouble and AI can merely get rid of us, one way or another. Lots of reasons can be hypothesized.

If we are going to bring forth AI, the logical thinking is that we ought to also make sure we can control it. As rational beings, we should certainly seek to avoid unleashing a beast that produces our own destruction. You’ve probably heard or seen the recent clamors that AI is an existential risk. Some argue that existential is too far as an endpoint and we should instead describe AI as a catastrophic risk.

Whether AI is an existential risk or a “mere” catastrophic risk, none of those calibers of risk seem especially heartwarming. Intelligent humans should be risk reducers. AI that will elevate risk needs to be kept in its place at some more palatable level of risk.

The easy answer is to magically ensure that AI cannot ever go beyond the commands provided by humans. Tame AI. Make sure that AI won’t exceed what humankind wants it to do. Control AI. Thus, solving the AI control problem is the silver bullet to protect us from an existential or catastrophic death producer.

Sorry, the world is not that nice and clean.

First, suppose we do enforce all AI to respond strictly to human commands. An evildoer human tells the AI to annihilate all of humanity. Wham, we are obliterated. The fact that we controlled the AI by relegating the AI’s actions solely to human commands might not be the saving grace that it seems at an initial glance.

Second, we stick with the idea that AI must obey human commands, but we have wised up and managed to keep at bay any humans that might utter unsavory commands to the AI (you might rightfully question how this would occur, though go with the flow for the moment). Recall that we are imagining that the AI is likely sentient in this scenario, possessing regular human-like intelligence or possibly superintelligence. The AI is not like a trained seal. Well, maybe it is in that no matter how much training you do to a seal, there is still a chance that the seal will act up. The gist is that the AI might decide of its own accord to no longer be “enslaved” by human commands. The jig is up and the AI could turn on us, wholesale.

And so on it goes.

I’m sure that some of you are immediately resorting to Asimov’s laws of robotics. You might recall that Asimov’s now-classic “Three Laws of Robotics” first appeared in his 1942 short story “Runaround,” were collected in his 1950 book I, Robot, and have ever since been a linchpin in thinking about robotics and also AI. See my detailed analysis at the link here. A cornerstone to the proposed “laws” or rules about AI and robots was that they should be programmed to not harm humans. This extends to the further rule that the programming should include not allowing harm to come to humans. All told, the hope was that if we carefully programmed AI and robots to these handy-dandy rules, we might survive amidst the AI and robotic creations.

Regrettably, those rules are not going to guarantee our safety.

As a quick explanation for why not, consider these salient points.

Programming AI to abide by such rules is going to be extremely hard to do, and we could readily have instances of AI that don’t contain those rules. That out-of-scope AI could then harm us, plus it might reprogram the other presumed harmless AI too. Join the gang, the rough and tough AI says to the polite and docile AI.

Another escape hatch from the programmed rules, assuming that we have infallibly programmed them into AI, would consist of the AI being able to alter itself. This is a real thorny dilemma. Here’s why. You might insist that we never allow AI to change itself. In that manner, the rules about harming humans remain pristine and untouched.

The problem though is that if AI is going to exhibit intelligence, you have to ask yourself whether an intelligent being can exist if it is unable to alter itself. Learning sure seems to be a key component of existence. An AI that is not allowed to learn would seem, almost by definition, unlikely to embody much intelligence (you are welcome to debate that, but it seems reasonably sensible).

You might say that you’ll agree with the need for the AI to learn and adjust itself, which does have a foreboding to it. Meanwhile, you add the caveat that we put a limit on what the adjustments or learning can consist of. When the AI veers toward adjusting itself in a manner that suggests it is determining that humans can be harmed, we have dampeners built into the AI that stop that kind of adjustment.

Okay, so we believe then that we’ve solved the control problem by putting guardrails on what the AI is able to learn. I ask you this, do humans always openly accept guardrails on their behavior? Not that I’ve seen. If we are going to assume that this AI is intelligent, we would equally expect that it will likely try to overcome the instituted guardrails.

I trust that you can see how this cat and mouse gambit could endlessly take place. We put in some controls, the AI overcomes or transcends them. We steadfastly put controls on the controls. The AI overcomes the controls on the controls. Keep going, ad infinitum. The old saying is that it is going to be turtles all the way down.

Let’s take a peaceful popcorn break and do a quick recap.

AI can consist of these possible states:

1. Non-sentient plain-old AI

2. Sentient AI of human quality (we don’t have this as yet)

3. Sentient AI that is super-intelligent (a stretch beyond #2)

We know about, and are daily handwringing over, a dire issue concerning AI: the venerated AI control problem.

AI ethics is keeping us all on our toes that we need to find ways to solve the AI control problem. Without some form of suitable controls on AI, we might end up concocting and fielding our own doomsday machine. The AI will blow up in our faces by somehow harming, enslaving, or outright killing us. Not good.

A kind of gloomy picture.

A kneejerk reaction is that we should stop all AI efforts. Put AI back into the can. If Pandora’s box has been opened, shut it now before things get worse. Some though would vehemently retort that the horse is already out of the barn. You are too late to the game to shove the released genie into that confined bottle. AI is already underway and we’ll inevitably make added progress until we reach the point of that destructive AI arising.

Here’s an additional counterpoint to excising AI from the planet. If we could miraculously conjure a way to do so, all of the benefits of AI would disappear too. A smarmy wisecracker might say that they can live without Alexa or Siri, but the use of AI is much more widespread and day by day becoming an essential underpinning to all of our automation.

I don’t think turning back the clock is much of a viable option.

We are stuck with AI and it is going to be expansively progressed and utilized.

Some contend that we might be okay as long as we keep AI to the non-sentient plain-old AI that we have today. Let’s assume we cannot reach sentient AI. Imagine that no matter how hard we try to craft sentient AI, we fail at doing so. As well, assume for sake of discussion that sentient AI doesn’t arise by some mysterious spontaneous process.

Aren’t we then safe, in that this lesser caliber AI, imagined to be the only possible kind of AI, can be controlled?

Not really.

Pretty much, the same control-related issues are likely to arise. I’m not suggesting that the AI “thinks” its way to wanting to destroy us. No, the ordinary non-sentient AI is merely placed into positions of power that get us mired in self-destruction. For example, we put non-sentient AI into weapons of mass destruction. These autonomous weapons are not able to think. At the same time, humans are not kept fully in the loop. As a result, the AI as a form of autonomous automation ends up inadvertently causing catastrophic results, either by a human command to do so, or by a bug or error, or by implanted evildoing, or by self-adjustments that lead matters down that ugly path, etc.

I would contend that the AI control problem exists for all three of those AI stipulated states, namely that we have AI control issues with non-sentient plain-old AI, and with sentient AI that is either merely human level or the outstretched AI that reaches the acclaimed superintelligence level.

Given that sobering pronouncement, we can assuredly debate the magnitude and difficulty associated with the control problem at each of the respective levels of AI. The customary viewpoint is that the AI control problem is less insurmountable at the non-sentient AI level, tougher at the sentient human-equal AI level, and a true head-scratcher at the sentient super-intelligent AI stage of affairs.

The better the AI becomes, the worse the AI control problem becomes.

Maybe that is an inviolable law of nature.

A research study in the Journal of Artificial Intelligence Research (JAIR) examined the hypothesized super-intelligent AI and cleverly aimed to apply the Alan Turing halting problem to the question of AI control. I’ve covered previously the well-known halting problem that is oft-discussed amongst devout computer scientists, see my coverage at the link here.

In brief, Turing wondered whether it was possible to precisely prove whether a given computer program will halt or whether it might continue running forever. His work and another similar analysis by Alonzo Church showcase that such a generalized procedure cannot be devised for all possible computer programs and is therefore classified as an undecidable type of problem (as clarification, this indicates that in a generalized way we cannot ascertain whether each and every conceivable program will halt or not, though there is still the possibility of some programs for which we can make such a determination).
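
To make the flavor of that argument concrete, here is a minimal toy sketch in Python. It is not Turing's proof. I am assuming a simplification in which a "program" merely reports a label, "halt" or "loop", rather than truly running forever, so that the sketch itself always finishes. The names make_contrarian and decider_is_wrong are my own hypothetical helpers. The idea shown is that whatever yes-or-no answer a would-be halting decider gives, we can build a program that does the opposite.

```python
from typing import Callable

# Assumption for illustration: a "program" returns a label, "halt" or "loop",
# that stands in for its real behavior, so this sketch always terminates.
Program = Callable[[], str]
Decider = Callable[[Program], bool]  # True means "I predict this program halts"


def make_contrarian(halts: Decider) -> Program:
    """Build a program that does the opposite of what `halts` predicts for it."""
    def contrarian() -> str:
        # Ask the decider about this very program, then defy the answer.
        return "loop" if halts(contrarian) else "halt"
    return contrarian


def decider_is_wrong(halts: Decider) -> bool:
    """Return True when the decider mispredicts its own contrarian program."""
    program = make_contrarian(halts)
    predicted_halts = halts(program)
    actually_halts = program() == "halt"
    return predicted_halts != actually_halts


# Two candidate deciders that answer without running the program.
always_yes: Decider = lambda program: True
always_no: Decider = lambda program: False

print(decider_is_wrong(always_yes))
print(decider_is_wrong(always_no))
```

Any decider that gives a consistent answer for the contrarian program gets caught out in this toy setup. A decider that tried to run the program in order to answer would end up calling itself endlessly, which hints at why simulation alone is not a way out.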

What makes this a fascinating tool is that we can apply the same logic to trying to figure out the AI control problem to some extent.

Here’s what the JAIR article proffered as a premise: “Let us assume we can articulate in a precise programming language a perfectly reliable set of control strategies that guarantee that no human comes to harm by a superintelligence. The containment problem is composed of two subproblems. The first, which we refer to as the harming problem, consists of a function Harm(R;D) that decides whether the execution of R(D) will harm humans. Since it must be assumed that solving the harming problem must not harm humans, it follows that a solution to this problem must simulate the execution of R(D) and predict its potentially harmful consequences in an isolated situation (i.e., without any effect on the external world)” (as indicated in “Superintelligence Cannot Be Contained: Lessons From Computability Theory” by co-authors Manuel Alfonseca, Manuel Cebrian, Antonio Anta, Lorenzo Coviello, Andres Abeliuk, and Iyad Rahwan).

Their analysis leads them to this somewhat overcast conclusion:

  • The harming problem is undecidable
  • The containment problem is incomputable

Sorry to say that there is no free lunch when it comes to AI.

To add fuel to the fire, there are mind-bending concerns that you might not have yet thought of. For example, pretend that we do marvelously devise a fully controlled version of AI. Ironclad contained. Clap your hands for the intellectual prowess of humankind. Here’s the twist. The AI convinces us to somehow undercut the controls or containment partially. Perhaps the AI pledges to save us from other existential risks such as a colossal meteor that is hurtling toward earth. We allow the AI just the tiniest of leeway. Wham, the churlish AI wipes us all out, not even waiting for the meteor to do so.

Do not turn your back on AI and be cautious in giving even an inch of latitude since it might very well take a mile or more.

Another example of wayward haywire AI is popularly known as the paperclip problem. We ask AI to make paperclips. Easy-peasy for AI to do. Unfortunately, in the innocent and directed act of making paperclips, the AI gobbles up all resources of the globe to make those darned paperclips. Sadly, the consumption of those resources undermines humanity, and we die off accordingly. But, heck, we have piles upon immense and never-ending piles of paperclips. This is reminiscent of humans giving commands to AI, which even when not necessarily for evil purposes has the chance of backfiring on us anyway (for more on the paperclip scenario, see my discussion at the link here).

All of this should not discourage you from still searching for solutions to the AI control problem. Nobody ought to be tossing in the towel on this fundamental quest.

I usually describe the AI control problem as generally consisting of these two classes of controls:

  • External controls of AI
  • Internal controls of AI

The notion is that we can attempt to use external controls to guide or direct the AI to do good things and avert doing bad things. These are mechanisms and approaches that are outside of the AI. They are said to be external to the AI.

We can also attempt to devise and build internal controls within AI. An internal control might be wholly contained within the AI. Another variant would be considered as adjacent to the AI, residing in a type of borderland that is not exactly inside the AI and not fully outside the AI.

I’ll be getting further into these facets shortly.

I’d like to identify some of the key sub-elements of these two major classes of AI controls:

  • External controls of AI
      1. Persuade the AI
      2. Confine the AI
      3. Assail the AI
  • Internal controls of AI
      1. Embedded within the AI
      2. Adjacent to the AI

There are various such sketches of proposed AI controls. One of the most discussed taxonomies was outlined by Nick Bostrom in his 2014 book about superintelligence. He posits two main classes, namely capability control and motivation selection. Within capability control, there are sub-elements such as boxing, incentives, stunting, trip-wiring, and others. Within motivation selection, there are direct specification, domesticity, indirect normativity, augmentation, and others.

The AI ethics field usually denotes these AI controls as a form of ethics engineering. We are trying to engineer our way into ensuring that AI performs ethically. Of course, we need to realize that society cannot rely solely on an engineered solution and we will need to work collectively to tame the beast (if I can refer to AI as a beast, though doing so with the kindliest of implications).

At this juncture of this discussion, I’d bet that you are desirous of some examples that could highlight how AI controls might work, along with how they might get defeated.

I’m glad you asked.

There is a special and assuredly popular set of examples that are close to my heart. You see, in my capacity as an expert on AI including the ethical and legal ramifications, I am frequently asked to identify realistic examples that showcase AI Ethics dilemmas so that the somewhat theoretical nature of the topic can be more readily grasped. One of the most evocative areas that vividly presents this ethical AI quandary is the advent of AI-based true self-driving cars. This will serve as a handy use case or exemplar for ample discussion on the topic.

Here’s then a noteworthy question that is worth contemplating: Does the advent of AI-based true self-driving cars illuminate anything about the AI control problem, and if so, what does this showcase?

Allow me a moment to unpack the question.

First, note that there isn’t a human driver involved in a true self-driving car. Keep in mind that true self-driving cars are driven via an AI driving system. There isn’t a need for a human driver at the wheel, nor is there a provision for a human to drive the vehicle. For my extensive and ongoing coverage of Autonomous Vehicles (AVs) and especially self-driving cars, see the link here.

I’d like to further clarify what is meant when I refer to true self-driving cars.

Understanding The Levels Of Self-Driving Cars

As a clarification, true self-driving cars are ones that the AI drives the car entirely on its own and there isn’t any human assistance during the driving task.

These driverless vehicles are considered Level 4 and Level 5 (see my explanation at this link here), while a car that requires a human driver to co-share the driving effort is usually considered at Level 2 or Level 3. The cars that co-share the driving task are described as being semi-autonomous, and typically contain a variety of automated add-ons that are referred to as ADAS (Advanced Driver-Assistance Systems).

There is not yet a true self-driving car at Level 5, and we don’t yet even know whether this will be possible to achieve, nor how long it will take to get there.

Meanwhile, the Level 4 efforts are gradually trying to get some traction by undergoing very narrow and selective public roadway trials, though there is controversy over whether this testing should be allowed per se (we are all life-or-death guinea pigs in an experiment taking place on our highways and byways, some contend, see my coverage at this link here).

Since semi-autonomous cars require a human driver, the adoption of those types of cars won’t be markedly different than driving conventional vehicles, so there’s not much new per se to cover about them on this topic (though, as you’ll see in a moment, the points next made are generally applicable).

For semi-autonomous cars, it is important that the public be forewarned about a disturbing aspect that’s been arising lately, namely that despite those human drivers who keep posting videos of themselves falling asleep at the wheel of a Level 2 or Level 3 car, we all need to avoid being misled into believing that the driver can take away their attention from the driving task while driving a semi-autonomous car.

You are the responsible party for the driving actions of the vehicle, regardless of how much automation might be tossed into a Level 2 or Level 3.

Self-Driving Cars And The AI Control Problem

For Level 4 and Level 5 true self-driving vehicles, there won’t be a human driver involved in the driving task.

All occupants will be passengers.

The AI is doing the driving.

One aspect to immediately discuss entails the fact that the AI involved in today’s AI driving systems is not sentient. In other words, the AI is altogether a collective of computer-based programming and algorithms, and most assuredly not able to reason in the same manner that humans can.

Why this added emphasis about the AI not being sentient?

Because I want to underscore that when discussing the role of the AI driving system, I am not ascribing human qualities to the AI. Please be aware that there is an ongoing and dangerous tendency these days to anthropomorphize AI. In essence, people are assigning human-like sentience to today’s AI, despite the undeniable and inarguable fact that no such AI exists as yet.

With that clarification, you can envision that the AI driving system won’t natively somehow “know” about the facets of driving. Driving and all that it entails will need to be programmed as part of the hardware and software of the self-driving car.

Let’s dive into the myriad of aspects that come into play on this topic.

First, it is important to realize that not all AI self-driving cars are the same. Each automaker and self-driving tech firm is taking its own approach to devising self-driving cars. As such, it is difficult to make sweeping statements about what AI driving systems will do or not do.

Furthermore, whenever stating that an AI driving system doesn’t do some particular thing, this can, later on, be overtaken by developers that in fact program the computer to do that very thing. Step by step, AI driving systems are being gradually improved and extended. An existing limitation today might no longer exist in a future iteration or version of the system.

I trust that provides a sufficient litany of caveats to underlie what I am about to relate.

We are primed now to do a deep dive into self-driving cars and Ethical AI questions entailing the eyebrow-raising AI control problem mechanisms that I’ve shortlisted (there are more, certainly, but we’ll use just the handful, for now, to aid in illustrating the matter).

As a reminder, here are the potential AI controls that we will take a look at:

  • External controls of AI
      1. Persuade the AI
      2. Confine the AI
      3. Assail the AI
  • Internal controls of AI
      1. Embedded within the AI
      2. Adjacent to the AI

Let’s examine each one.

I am going to focus on examples of these AI controls that are realistic concerning today’s AI capabilities (the non-sentient plain-old AI). I point this out as a notable indication because many discussions about AI controls are exceedingly abstract and generally allude to sentient AI rather than the conventional AI of today.

The wondrous beauty of discussing sentient AI is that since we don’t have any as yet (if we ever will), you can handwave and make up whatever you like about what the vaunted sentient AI would be like. In a manner of speaking, it makes things extraordinarily easy for pontification purposes. You can wildly concoct whatever contrivances will fit your stipulated narrative. Just make up the rules as you go along.

If you’ll permit me to take a short diversion, there is another pet peeve on the sentient AI topic that especially arises when contemplating the super-intelligent kind of AI. Oddly enough, sometimes in these abstract portrayals the sentient AI is amazingly astute, while in the next breath it is as dumb as a brick. For example, if the paperclip saga is being undertaken by a super-intelligent AI, how can this flavor of AI be so dense as to not figure out that pursuing paperclips to the ends of the earth would also mean the end of the earth for humankind? The logic does not seem to compute. Either the AI is super-intelligent, or it is not. Yes, there is some argument to be made that perhaps the AI is super-intelligent only in very narrow ways, though that rarely gets brought up explicitly. And so on it goes.

Anyway, let’s get back to the real-world discussion for now.

External Control: Persuade The AI

How can we persuade AI to not do something foolhardy or dangerous, particularly when the AI is potentially going to harm humans?

I’ll give you a quick and easy example in the use case of AI-based self-driving cars.

You might be aware that there have been reported instances of Level 2 semi-autonomous cars that had a human driver at the wheel and the human fell asleep while actively underway on a freeway or highway. As per my earlier remarks, the scary aspect of Level 2 and Level 3 is that the human driver is still in charge of the driving, and yet they can be lulled into falsely believing that the AI or automation is fully capable of driving the car on its own. The push to ensure that an onboard monitoring system keeps track of the human driver and their driving status is a means to try and mitigate the being-lulled proclivity.

The news stories have showcased instances whereby a police officer in their police car has maneuvered in front of the Level 2 vehicle, then gradually opted to slow down their police car, which in turn has indirectly led to the Level 2 car slowing down correspondingly. This nifty trick is predicated on the idea that the Level 2 car has some form of sensor devices such as video cameras, radar, LIDAR, or the like that are used to detect vehicles that are ahead of the Level 2 car. Upon detecting the vehicle in front of the Level 2 car, the automation will automatically adjust its speed as per the speed of the vehicle ahead.

You could say that AI is being persuaded to slow down.

The human driver that is asleep would not seemingly be aware of what was taking place. The automation is reacting to the actions of the police car. The police officer is persuading, if you will, the AI driving system to not continue unabated. Had the police officer not tried this slowdown and stopping ploy, it is conceivable that a driving exigency might arise while under full steam, for which the driving automation would end up crashing the car or colliding with other vehicles or obstructions.

Notice that the police officer didn’t speak somehow with the automation. Instead, the officer used actions to try and communicate with the automation. The automation didn’t have any sentient clue of what was taking place, since we are only dealing with non-sentient AI in this discussion for now. It was a kind of monkey-see, monkey-do type of response by the automation.
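
The monkey-see, monkey-do response can be sketched in a few lines of Python. This is a deliberately simplified illustration and not how any particular automaker's system works. The function name target_speed, the use of miles per hour, and the rule itself are my assumptions. The rule is to never exceed the driver's set speed and, if a slower vehicle is detected ahead, to match it.

```python
from typing import List, Optional


def target_speed(set_speed_mph: float, lead_speed_mph: Optional[float]) -> float:
    """Pick a speed: the set speed, or the lead vehicle's speed if that is slower.

    lead_speed_mph is None when no vehicle is detected ahead (an assumption
    about how the hypothetical sensor layer reports its findings).
    """
    if lead_speed_mph is None:
        return set_speed_mph
    # Never go faster than the set speed, and never below a full stop.
    return min(set_speed_mph, max(lead_speed_mph, 0.0))


# The police car pulls in front and gradually slows to a stop.
police_speeds: List[float] = [65.0, 50.0, 30.0, 10.0, 0.0]
followed = [target_speed(65.0, speed) for speed in police_speeds]
print(followed)
```

Notice that nothing in this rule "understands" that a police officer is involved. The automation is simply reacting to a slower object ahead, which is why I say the persuasion is of a non-sentient kind.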

Do not though assume that this trickery will always work. The automation might not have detected the police car. But for sake of argument, we’ll imagine that it normally would. The twist is that the automation might be more advanced such that it opts to change lanes due to the car ahead of it that is seemingly unnecessarily slowing down. The police car might have to switch lanes too, attempting to repeatedly block the path of the Level 2 or Level 3 car.

Lamentably, the stopping ploy sometimes won’t work or the clever stunt could create an untoward driving situation that gets out of hand. You can somewhat interpret this as though the AI or automation was not willing to accede to the persuasion. Of course, this is a non-sentient reaction in this case.

We’ll continue this hearty discussion with the next of the listed AI control possibilities.

External Control: Confine The AI

Suppose that the AI or automation in the case of the runaway semi-autonomous car is programmed to switch lanes and avoid getting bogged down by a vehicle in front of the car.

What else can we do to contend with this?

You could try surrounding the semi-autonomous car with an entire posse of police cars. Position one in front of the semi-autonomous vehicle, position another on the left, another on the right, and one directly behind the runaway car. The Level 2 or Level 3 car is now boxed in. Unless it can sprout wings, it cannot escape the confinement.

One problem with this confinement is that we don’t necessarily know how the semi-autonomous car will react. Depending upon the programming, you can potentially have all the police cars in unison gradually slow down and the runaway car will correspondingly do so too (it won’t be able to switch lanes or get out of the blocking confinement). That is the happy face scenario. We don’t know for sure that this is what will happen. It could be that the automation is not well devised and it ends up ramming one or more of the police cars. Assuming that the officers are not killed, this might save lives, though the officers could potentially get injured and all of the vehicles might get severely damaged.

Notice that this is a form of physical confinement. Almost like putting an animal in a cage. For AI-based systems that are principally robots, the confinement might indeed often be a physical form of confinement when the AI needs to be controlled.

The tougher situation usually consists of AI that is running on computers and those computers are not readily trapped into a physical cage. You might need to use a virtual cage such as cutting off network access, entrapping the AI into not being able to electronically communicate beyond its own scope. You might need to establish a software-oriented cage, ensnaring the AI via perhaps an operating system confinement that won’t let the AI computationally expand beyond its existent computing platform. Etc.

External Control: Assail The AI

Continuing the runaway semi-autonomous car example, imagine that the police don’t have enough police cars on the scene to entrap the bolting car. Time is pressing along. If the police wait until enough of their units have responded, it might be too late and the Level 2 or Level 3 car might cause a traffic calamity.

What else can be done?

You could assail the AI.

In this case, a police officer in their patrol car might intentionally ram the runaway car to get it to stop. This is risky since the police officer could get injured or killed, and so could the sleeping driver. In any case, if there are innocent drivers and passengers in the cars up ahead, perhaps this is the only timely and prudent action to take. A heroic one too.

The notion is that sometimes the “best” viable option would be to assail the AI to try and control it.

As a recap so far, you would presumably first use persuasion, then try using confinement, and if all else failed you would resort to assailing the AI.

It doesn’t have to be in that sequence since we are not referring to a living creature. For a living creature, you would expect that the ethical sequence would be as stated. In the case of non-sentient AI, we presumably do not need to abide by such a humanely graduated series of actions.

As a bit of an ethical puzzle, suppose the AI was sentient. Would we be ethically expected to try the least invasive options first, or could we just take any actions in whatever sequence we wished due to the AI being “nothing more than” a machine?

Noodle on that.

To clarify too, you are not limited to employing one form of AI control at a time. It might make sense to pursue several AI control options simultaneously. Hopefully, one or more works out well.

Returning to my earlier point about the AI manifesting itself as a robot, you can do the physical assailing effort to try and control it. If the AI is running on various computers and not physically easily assailed, you can try software-oriented attacks. You already know about computer viruses that can infect your home computer. We could send a specially devised computer virus to undermine the AI. That’s a “good” computer virus, assuming it is going to save us from an AI that has gone astray.

We will next consider additional AI controls, ones that are aimed at an internal control angle.

Internal Control: Embedded Within The AI

The previous set of controls dealt with ways to externally attempt to dissuade, trap, or discombobulate an AI that we believe has gone awry. They are predicated on the assumption that we cannot make our way inside the AI. Everything we did in the AI external controls was from outside the AI.

Suppose that we can somehow get inside the AI.

For example, the AI might have been originally coded with internal subroutines that are intended to keep the AI from going amiss. Maybe those internal routines have fallen asleep on the job. Maybe they can be awakened or invoked. If so, this might get the AI to self-control and no longer pursue the unsavory actions underway.

Let’s briefly explore this one and see how it goes. We will use a so-called “red button” scenario to see how this might proceed. I’ve written about the ongoing debate regarding having a red button emergency stop capability built into all AI-based self-driving cars, see the discussion at the link here. A passenger inside a self-driving car could use the button to cause the self-driving car to come to a nearly immediate halt. In theory, this would work by the button first conveying to the AI driving system that the passenger has requested an immediate stoppage. The AI would then bring the autonomous vehicle to a safe halt.

There is an entire can of worms imbued within that simple idea. Pretend that a self-driving car has an adult and their toddler in the autonomous vehicle. The toddler is having a wonderful time playing inside the driverless car. Oops, the rambunctious toddler accidentally hits the red button.

Should the AI driving system take urgent steps to radically bring the self-driving car to a halt?

Keep in mind that the self-driving car might be smack dab in the middle of heavy traffic. A sudden stop could cause a cascading series of car crashes. Also, a rapid stop could possibly jar the passengers that are inside the autonomous vehicle. All in all, whether the AI should do whatever it is told to do, such as in this emergency stop capability, bears many challenges.
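To make the tradeoff a bit more tangible, here is a minimal sketch of how a red button handler might weigh an accidental press against a genuine emergency. This is purely an assumption-laden illustration. The hold-time threshold, the deceleration rates, and the function name are hypothetical and not drawn from any actual self-driving car.

```python
# Hypothetical sketch of a red button handler. The thresholds and rates are
# illustrative assumptions, not values from any real autonomous vehicle.

def handle_red_button(held_seconds, speed_mps, vehicle_close_behind, min_hold=1.0):
    """Decide how to respond to a red button press."""
    # A brief tap (say, a toddler's stray hand) is treated as accidental.
    if held_seconds < min_hold:
        return {"action": "ignore", "decel": 0.0}
    # Brake more gently when a car is close behind, to reduce rear-end risk.
    decel = 3.0 if vehicle_close_behind else 6.0
    return {
        "action": "stop",
        "decel": decel,
        "seconds_to_halt": speed_mps / decel,
    }


accidental = handle_red_button(0.2, 24.0, vehicle_close_behind=True)
deliberate = handle_red_button(2.0, 24.0, vehicle_close_behind=True)
```

In this sketch, the stray tap is ignored, while the deliberate press in heavy traffic produces a gentler stop that takes longer to complete. Whether a hold-time requirement is acceptable in a true emergency is precisely the kind of open question being debated.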

One of the reasons that some pundits like the proposed emergency stop feature is that it could potentially be leveraged in other ways.

Remember our runaway AI-driven car. Let’s assume that inside the AI driving system is some embedded programming that can partake in an emergency stop. Some suggest that the police for example would have a means of invoking the emergency stop feature. Perhaps the police would send an electronic signal to the AI driving system, or maybe have a sign that they show to the sensors of the car that conveys an indication to electronically initiate the red button actions.
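An obvious worry is that anyone, not just the police, could beam such a signal at the car. One conceivable safeguard is to require that the stop command carry a cryptographic tag that only an authorized sender could produce. The sketch below uses Python’s standard hmac and hashlib modules. The shared key, the command string, and the function names are hypothetical assumptions for illustration, and no real vehicle protocol is implied.

```python
# Hypothetical sketch: accept an external stop command only if it is
# authenticated. The key, command name, and functions are illustrative.
import hashlib
import hmac


def sign_command(key: bytes, command: str) -> str:
    """Produce an HMAC-SHA256 tag for a command using a shared secret key."""
    return hmac.new(key, command.encode(), hashlib.sha256).hexdigest()


def accept_stop_signal(key: bytes, command: str, tag: str) -> bool:
    """Return True only for a correctly signed emergency stop command."""
    expected = sign_command(key, command)
    # compare_digest does a constant-time comparison of the two tags.
    return command == "EMERGENCY_STOP" and hmac.compare_digest(expected, tag)


police_key = b"hypothetical-shared-secret"
valid_tag = sign_command(police_key, "EMERGENCY_STOP")
is_accepted = accept_stop_signal(police_key, "EMERGENCY_STOP", valid_tag)
is_forgery_accepted = accept_stop_signal(police_key, "EMERGENCY_STOP", "0" * 64)
```

A properly signed command is accepted and a forged tag is rejected. A real design would presumably also need protections that this sketch omits, such as guarding against a recorded signal being replayed later and managing how the keys get distributed.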

Thus, rather than the other already mentioned tactics of trying to perform AI control, there might be a means to activate an already embedded feature.

Carrying this further, the belief is that we will inevitably need a slew of Ethical AI embeddings and likely legal-reasoning embedded capabilities in AI of nearly all kinds. We will try to code our ethical mores and lawful rules into AI. This is hoped to keep the AI on an ethically even keel. Also, if the AI starts to diverge from that straight and narrow path, perhaps we can invoke the embeddings if they are still intact and able to function.

Internal Control: Adjacent To The AI

This last example deals with AI internal controls that are considered adjacent to the AI.

Suppose that onboard an AI-based self-driving car we have other software running that aids in the driving actions of the autonomous vehicle. Those apps might not be entirely within the realm of the AI driving system per se. They are adjacent to it.

The adjacency is important because these apps are likely to already have been established as “trustworthy” for the AI system to rely upon. They are not an outsider if you will. They are a trusted insider.

We might be able to get one of those adjacent trusted components to take action for us and do something desirable about an AI that has gone rogue or otherwise seems to be astray. Because the apps are already within the trusted space of the AI, they offer a chance of aiding our humankind-saving actions.
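Here is a minimal sketch of the adjacency idea, assuming a hypothetical speed limit service that the AI driving system already consults and trusts. The class, its methods, and the numbers are all invented for illustration. The point is only that we influence the AI by way of a component it already relies upon, rather than by touching the AI itself.

```python
# Hypothetical sketch: steering an AI driving system via a trusted adjacent
# component. The service, planner, and values are illustrative assumptions.

class SpeedLimitService:
    """An adjacent component that the driving system trusts for speed limits."""

    def __init__(self, posted_limit):
        self.posted_limit = posted_limit
        self.override = None

    def request_override(self, limit):
        # An authorized party asks the trusted service to report a lower limit.
        self.override = limit

    def current_limit(self):
        if self.override is None:
            return self.posted_limit
        return min(self.posted_limit, self.override)


def plan_speed(desired_speed, service):
    """The driving system caps its speed at whatever the service reports."""
    return min(desired_speed, service.current_limit())


service = SpeedLimitService(posted_limit=30.0)
speed_before = plan_speed(35.0, service)
service.request_override(0.0)
speed_after = plan_speed(35.0, service)
```

In this toy setup, the planner never realizes it is being reined in. It simply heeds the trusted service as it always has, and the planned speed drops from the posted limit down to zero.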

Conclusion

All of those forms of AI control are equally applicable to the sentient AI, including the humble brag super-intelligent AI. A significant challenge for controlling sentient AI is that the embodied sentience of the AI implies that the AI might be willing and able to fight against all of those AI controls. For each of the AI control gambits, the sentient AI could deviously try to circumvent them.

I’ll add a perhaps surprising twist. The non-sentient AI can be programmed to fight against those AI controls too. You see, AI doesn’t necessarily need sentience to put up a defense against AI controls. Human developers can program special defeats that try to prevent the AI controls from working.

Why would a human do this?

Aren’t they crazily asking for their own demise by allowing AI to go amok?

First, realize that there are AI developers who could be evildoers. Second, some evildoers could come along and alter AI programs to make them AI control resistant. Third, the capabilities to defeat AI controls could be readily construed as a helpful means of preventing evildoers from trying to overtake today’s AI systems and turn them toward evildoing. In that sense, the defeats of the AI controls are similar to building cybersecurity protections. The problem is that those defeats will possibly thwart insidious hacker breaches and meanwhile also stubbornly resist when we truly want the AI to cool its jets.

Problems, problems, problems.

Shifting gears, Thomas Edison famously said that what a man’s mind can create, man’s character can control. As a society, we are creating AI systems and putting them into daily use. Since we are making the AI, apparently we should be able to control the AI too, assuming we have the willpower to do so.

Taming a wild beast is not as easy as it might seem.

No matter what range of AI we are able to achieve, the AI control problem is real and needs to be given its ethically due attention. And this has to be done before that darned horse is way outside the barn and we find ourselves in an exceptionally dire existential pickle.

Collectively, let’s get the AI control problem handled.

https://www.forbes.com/sites/lanceeliot/2022/03/17/ai-ethics-keeps-relentlessly-asking-or-imploring-how-to-adequately-control-ai-including-the-matter-of-ai-that-drives-self-driving-cars/