Friday, November 25, 2011

Computer Science: NP-complete Problems are Really NP-complete

First of all, let me apologize for my sloppy terminology. I've always been a better software engineer than a computer scientist.

This is how Wikipedia describes NP-complete problems:
In computational complexity theory, the complexity class NP-complete (abbreviated NP-C or NPC) is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial-time, and also in the set of NP-hard problems so that any NP problem can be converted into L by a transformation of the inputs in polynomial-time.
Although any given solution to such a problem can be verified quickly, there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a result, the time required to solve even moderately large versions of many of these problems easily reaches into the billions or trillions of years, using any amount of computing power available today. As a consequence, determining whether or not it is possible to solve these problems quickly, called the P versus NP problem, is one of the principal unsolved problems in computer science today.

While a method for computing the solutions to NP-complete problems using a reasonable amount of time remains undiscovered, computer scientists and programmers still frequently encounter NP-complete problems. NP-complete problems are often addressed by using approximation algorithms.
Note that it says, "As a consequence, determining whether or not it is possible to solve these problems quickly, called the P versus NP problem, is one of the principal unsolved problems in computer science today." I haven't seen the proof for it, but I've also heard that if you could prove that one NP-complete problem does not have a polynomial-time solution to it, then that would prove that none of them have a polynomial-time solution to them. The theory is a little over my head, but I'm going to take a shot.

I'd like to propose a problem that is NP-complete and show that it does not have a polynomial-time solution to it. Imagine I pick an n-tuple of random 32-bit, unsigned integers. Your job is guess what the n-tuple is. Hence, where n is 2, I might pick (376, 792), and your job is to guess that tuple. If you guess a certain tuple, I can tell you whether you're right or not. Hence, I can verify a solution in polynomial-time (it takes me O(n) number of int comparisons to verify a solution). However, to try to solve such a problem, you either have to get really lucky or you have to brute force it. If you find a method other than brute force trying every possible solution, then obviously there's something wrong with my random number generator. To use brute force to guess the solution requires O(A^n) for some constant A. Since it's "to the n", it's not polynomial-time (3nA^3 is polynomial-time since the exponent is fixed; A^n is not polynomial-time since the exponent varies with the size of n).

Now I know that some of you reading this understand the computer science a lot better than I do. Have I actually shown (at least informally) that NP-complete problems are really NP-complete? If not, can you, in lay man's terms explain what I'm misunderstanding? Thanks!

15 comments:

Aaron said...

Just one thought - you're saying that the only other alternative to brute forcing it is to get "lucky." I think you're being disingenuous as to what "lucky" is here.
Everyone studying these problems has full access to every other lemma and theorem out there. Getting "lucky" is often a simple matter of combining several of these lemmas and theorems together in interesting new ways.

It's important to note that "lucky" is not the same as "trial and error" or "random."

Shannon -jj Behrens said...

When I said "lucky", I was just trying to be funny. There aren't any lemmas or theorems that will help you figure out which random n-tuple I have in mind other than brute forcing it. When you try one n-tuple, I tell you that you're either wrong or right. I don't give you any information that will help you get substantially closer to the right answer. (For instance, I don't tell you if you're partially right.)

Shannon -jj Behrens said...

Shawn Drape said:

I think you also have to prove that your proposed problem can be transformed into a known NP-Complete problem in polynomial time. Use the traveling salesman as a target.

To which I responded:

Hmm, here is a way to convert the problem into the traveling salesman problem. Number each of the cities. Your trip among the cities is a tuple of numbers. If the cost of each trip between cities is completely random (i.e. if you don't assume that close cities are cheaper to travel between than distant cities), then the optimal solution is an N-tuple of city numbers. Either way, you have to at least partially examine every possible solution.

Alexey Romanov said...

The problem with your problem is that it isn't a decision problem :)

A decision problem for tuples of integers is basically just a set of such tuples. And given a tuple, you must determine whether it's in the set. Note that there is nothing adversarial here: you can't keep the set "secret".

So, if the set only contains one tuple, the algorithm is simple: compare argument tuple with this tuple; if they are the same, return "yes", otherwise return "no". This problem is in P (and even in L).

Shannon -jj Behrens said...

Alexey, thanks a lot for your response. Let me see if I can turn it into a decision problem.

Let's say I have an encrypted message. I already know what it's supposed to say, and I already know the algorithm that was used to encrypt it. What I don't know is the key used to encrypt it. My goal is to find the key so that I can decrypt other messages.

The key happens to be an n-tuple of integers. The algorithm is such that there is exactly one n-tuple that can be used to decrypt the message in such a matter that I end up with the original message. Furthermore, the algorithm is such that the person who picked the key had the freedom to choose any n-tuple of 32-bit, unsigned integers.

Given all that I know, I must decide what the key is.

Vincent said...

Just like to point out that using tuple is no different that just guessing a any integer. The set of all possible integer tuples is the same size as the set of integers really they are the same size the set of reals is larger though.

Is N know to the guesser?

Not sure if this is a "decision" problem but once you guess a no more that is bigger that the solution , chose the largest 32bit integer, then the problem is polynomial. Just count 1,2,3,4,5 ( keeping in mind that the tuples a the same size as the integers and therefor can be ordered/sorted/counted.
http://en.wikipedia.org/wiki/Countable_set

Paul Du Bois said...

For a problem to be NP complete it is necessary to show the reduction in both directions: show that the problem is no easier than other NP-Complete problems (convert a known NPC to this problem in poly time) and no harder as well (convert this problem to a known NPC in poly time).

I think you've only given a single direction.

Alexey Romanov said...

1. Again, "what the key is" is not a yes/no question. You can instead use something like "given plain text message, encrypted message, numbers i and m, is the i-th element of the key greater than m?"

2. For example, say your algorithm just XORs plain text with repeated copies of the key. It does satisfy your conditions (you can use any n-tuple as the key and there is no more than 1 key which converts the given plain text to the given encrypted text), but it's trivial to find the key: just XOR plain and encrypted messages.

3. So you need to pick some algorithm and prove that there is basically no other way to find the key except by guessing. And well, so far nobody has been able to do that!

Bijan Parsia said...

First a general methodological point: Given the triviality and obviousness of your approach, your own self-awareness of your lack of grip on the problem ("I haven't seen the proof for it, but I've also heard that if you could prove that one NP-complete problem does not have a polynomial-time solution to it, then that would prove that none of them have a polynomial-time solution to them."), then the overwhelming odds are that your approach is hugely flawed. A good exercise is to try to break it yourself. Indeed, in general, this is a better approach than trying to defend it against other people breaking it (though that's not unreasonable). Confirmation bias alone is a big danger and you learn more about what makes the problem and your proposed solution tick by finding flaws.

There are a lot of problems with the proof (many have been pointed out), but I'd start with:

"Hence, I can verify a solution in polynomial-time (it takes me O(n) number of int comparisons to verify a solution)."

There are a number of issues other commenters have raise, but, here's the one I think is key:

Who is doing the verifying? If it's the person (really *process*) with antecedent knowledge of the answer, then they aren't really verifying the the answer, they are doing a look up. Let P be the proposed solver of this problem, V be an independent verifier and Y be you, the person who made up the number. V is independent in the sense that there is no communication between V and Y other than the knowledge of what the problem is (i.e., guess a randomly generated number).

P provides a possible answer...how does V verify it? It can't! There's no verifying that the answer is right because there's no way to tell what the right answer is except by asking Y. Once we get an answer from Y we can compare answers in linear time. But V doesn't have any way of knowing that Y didn't lie! Thus it cannot verify the answer.

Imagine there were Y1 and Y2, and let N=2. It's highly likely that Y1 and Y2 will generate different tuples and suppose that is the case. Now P is told, "Solve this problem for N=2". Which answer is the right answer such that V should confirm it? Y1's or Y2's? There's no being right here. Thus this isn't the right kind of problem. (Note, we know of problems that are strictly harder than NP-COMPLETE problems. So you have to be careful that you haven't shown that one of them doesn't have a polynomial solution...that would just tell us what we already know.)

Shannon -jj Behrens said...

Guys, thanks for your comments. It'll take me a while to digest them all.

Bijan, "stupidly clever" is what I was aiming for. The proof for the halting problem was "stupidly clever."

Bijan Parsia said...

Er...I definitely wouldn't characterize either the halting problem or its proof as "stupidly clever". If you look at the history of early 20th century logic (for example) it fits in (albeit as a serious advance/result). See Goedel's incompleteness theorem.

I don't know how else to put that I think your approach, esp. in the way you double down in this last comment, is not likely to lead you to a good understanding of the issues (or of complexity theory). You also are running the risk of kooking out. ("Trying to be stupidly clever...like the halting problem" is pretty close to "I have a proof that special relativity is false" or taking your hot cousin to the prom or "I'm like Galileo and you are like the pope, oppressing my scientific geniusness!!!" I don't think you are *there*, just uncomfortably close by.)

I think the P = NP question is one of the most amazing and fun things to explore, so I strongly encourage you to do so. I just rather suspect that the current approach is particularly efficient (e.g., afaict, you aren't even in the ballpark; cf my other comment).

I find the Gary and Johnson complexity book to be really nice for getting into the groove.

Shannon -jj Behrens said...

Bijan, thanks for the book recommendation. I'll try my best not to kook out. I don't have any problem with special relativity. When I say "stupidly clever", but what I mean is that sometimes a hard problem has a simple, but clever solution. "Elegant" is probably a better word. The reason I haven't invested the time to learn complexity theory properly is that I just don't have enough free time to do it right now. The reason I posted a totally half-baked, uninformed solution on my blog is that if it did happen to have some nugget of usefulness, someone else would probably run with it, and if it didn't happen to have any nugget of usefulness, I don't mind looking stupid sometimes--it's good to keep my ego in check. Also, I knew there were smarter people than me who read my blog who might be able to explain things to me in terms even I could understand.

Bijan Parsia said...

Fair enough.

Kelly Yancey said...

Hi JJ, just to chime in, I think your assertion that it takes "O(A^n) for some constant A" operations to find the answer is incorrect. Consider that to guess one 32-bit integer, you need a maximum of 32 operations, thanks to the power of binary search. So that is constant. So if you had 2 32-bit ints to guess, you would need 2*32 operations; for 3 32-bit ints, you would need 3*32 operations at max. So, for N 32-bit ints, you could guess the answer in N*32 operations. So, in big-O notation, it takes O(N) operations to find the answer; it isn't polynomial time, but linear with N.

Shannon -jj Behrens said...

Kelly, thanks for your comment. Two points:

* I don't think you can do a binary search since the response is "yes" or "no", not "you're too high" or "you're too low".

* It was a mistake to use 32 bit unsigned ints. Vincent pointed out the flaw in that. I'd suggest using an N-tuple of arbitrarily large whole numbers.