
Some time ago, a colleague in our group asked me to help troubleshoot a production problem. The investigation turned out to be interesting, so I wrote it up in this article in case it helps others.

It started a few days ago, when an alert suddenly fired in production with the error TypeError: C.fn is not a function . The colleagues involved tried to troubleshoot it without success, then rolled back the most recently released changes, but that did not make the problem go away either. Although the steps to reproduce were eventually confirmed, the error could not be reproduced locally.

🔍 Preliminary investigation

After reproducing the error in production, click the file link in the error stack to jump straight to the failing code. Since all production code is minified, click {} in the lower left corner to pretty-print it.

After pretty-printing, we can see that the problem should be on line 189624. If we set a plain breakpoint on this line, it is hit over and over, because the line sits inside a for loop. It is not hard to see that the code walks the linked list this.head . On each iteration, C is assigned the next node of the list and that node's fn() method is called. In other words, some node on this list has no fn() method, which is what raises the error.
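To make the failure mode concrete, here is a minimal sketch of such a loop. It is not the library's code: makeNode , runChain and the sample nodes are names I made up for illustration, and the only things taken from the investigation are the fn and next properties and the variable name C .

```javascript
// Build one node of a singly linked list of operations (illustrative helper).
function makeNode(fn, next) {
  return { fn, next };
}

// Walk the list from `head` and call fn() on every node,
// the same shape as the loop seen in the minified code.
function runChain(head) {
  const calls = [];
  for (let C = head; C; C = C.next) {
    C.fn(calls); // throws a TypeError when C.fn is not a function
  }
  return calls;
}

// A healthy list: every node has a callable fn.
const good = makeNode((out) => out.push('a'), makeNode((out) => out.push('b'), undefined));

// A broken list: the last node is some unrelated object without fn.
const bad = makeNode((out) => out.push('a'), { encode() {}, decode() {} });

console.log(runChain(good)); // [ 'a', 'b' ]

try {
  runChain(bad);
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```

The first node of the broken list still runs normally. The error only appears when the loop reaches the odd node, which is why stepping from the top of the loop takes so long to get there.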

Having roughly confirmed the problem, we need to see what C is when it fails. Because the code is in a loop, stepping through it again and again is tedious. Since we know exactly what we are looking for, we can add a conditional breakpoint: execution pauses only when our condition is met and otherwise continues normally.

Right-click line 189624, choose Add conditional breakpoint... and enter typeof C.fn !== 'function' as the condition. The breakpoint now fires only when C.fn is not a function.
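As a rough model of what the debugger does with that expression: it evaluates the condition each time the line is reached and pauses only when the result is truthy. The sketch below imitates this over a made-up array of nodes; shouldPause and nodes are hypothetical names, and only the condition itself comes from the investigation.

```javascript
// The expression typed into the conditional breakpoint, wrapped as a predicate.
function shouldPause(C) {
  return typeof C.fn !== 'function';
}

// Hypothetical stand-in for the values C takes on successive iterations:
// three normal nodes, then one object without fn.
const nodes = [
  { fn() {} },
  { fn() {} },
  { fn() {} },
  { encode() {}, decode() {} },
];

// The first three hits are skipped; the "breakpoint" fires on the fourth.
const pausedAt = nodes.findIndex(shouldPause);
console.log(pausedAt); // 3
```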

Once the conditional breakpoint fires, we can inspect variables in the console, which evaluates in the context of the paused frame. In the left picture below we can clearly see that C.fn does not exist at this point.

We already know that this.head is a linked list whose nodes' methods are executed in turn, so in theory every node on the list should have the same shape. I therefore printed all the nodes on the this.head list to see what it looks like, by writing a loop in the console that mimics the one in the code. The output is shown in the left picture below: the last node in the list is the problematic one.

We already knew that the problem could not be reproduced in the local development environment, so I printed the this.head list at the same location locally; the result is shown in the picture on the right. The local output is essentially the same as the production output, except that the problematic last node is missing.

So the cause appears to be that the production code appends this extra node to the list, while the local code never triggers the error because the extra node is absent.
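A loop of that kind might look like the sketch below. dumpChain , emptyFn and sample are names I invented for the example, and the sample data only imitates the shape of the list; at a real breakpoint you would call something like dumpChain(this.head) in the console instead.

```javascript
// Collect every node reachable from `head` by following `next`.
// The limit guards against an accidental cycle in the list.
function dumpChain(head, limit = 1000) {
  const out = [];
  for (let node = head; node && out.length < limit; node = node.next) {
    out.push(node);
  }
  return out;
}

// Hypothetical data: two normal nodes followed by one odd object.
const emptyFn = function () {};
const sample = {
  fn: emptyFn,
  next: {
    fn: emptyFn,
    next: { encode() {}, decode() {} },
  },
};

// Summarise each node by the type of its fn property.
const summary = dumpChain(sample).map((node) => typeof node.fn);
console.log(summary); // [ 'function', 'function', 'undefined' ]
```

Comparing such a summary from production with the one from the local build makes the extra node stand out immediately.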

🐞 Confirm the problem

Having found the cause, I wanted to know where in the code this node gets added. Since i.prototype.finish is clearly visible in the code above, my first guess was that this is a class definition, so I wanted to see where the class is instantiated and used.

From the minified code where the error was reported, we can see that the failing module is "protobuf.js". I searched the project and its dependencies to find out which module depends on it, and found that it is used by an IM messaging module we use internally.

Searching for .finish() in that dependency leads to the call site below. The serialize() method calls Request.encode() , which returns an instance of $Writer , which is the Writer base class from protobuf.js. Request.encode() instantiates a Writer , calls a series of its member functions, and returns the Writer instance, whose finish() method is then called.

With the flow understood, I started from the statement Request.encode(req).finish() and set breakpoints inside Request.encode() , working backwards from the end (left picture below). First I printed o.head at the breakpoint at the end ( o is the minified variable that points to the Writer instance) and found that the abnormal node was already on the list (right picture below).

A breakpoint in the middle of the method showed the same thing. A breakpoint at the top finally gave a clue: with a breakpoint at the very beginning, the abnormal data already existed on the o.head list right after line 120274 executed.

Next, let's look at the code to see what o.create() does. On the left of the figure below, we can see that Writer.create() is essentially a factory method that instantiates the Writer base class. The figure also shows that the Writer constructor assigns initial values to some member properties; the key one, this.head , is initialized to an instance of the Op base class. On the right of the figure, the Op constructor likewise assigns some initial values. We can also see that function noop() {} is an empty function. In other words, by default this.head points to an Op object constructed with an empty function.

At first glance the whole process is very simple: the constructors just perform a few assignments, and nothing should go wrong. So we have to keep following the call path. Since the problem already exists after the Writer.create() factory method runs, the next step is to set a breakpoint in the Writer constructor.

Printing the this.head list at a breakpoint at the end of the constructor, as shown below, reveals that the abnormal data is already there. But only initial values have been assigned at this point, so how can anything be wrong? Since the console evaluates in the current context while paused, I instantiated the Op base class by hand (see the figure below). Its next property turned out to be wrong: it is exactly the problematic node we were looking for!
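For readers without the figures, the structure just described can be sketched roughly as follows. This is a simplified reconstruction from the description, not the library's exact source: the len , val and tail fields are my assumptions, while Op , Writer , Writer.create() , noop , head , fn and next are the names from the investigation.

```javascript
// An empty function, used as the operation of the first node.
function noop() {}

// One node of the operation list.
function Op(fn, len, val) {
  this.fn = fn;          // function executed when the list is walked
  this.len = len;        // assumed field
  this.next = undefined; // a fresh node has no successor
  this.val = val;        // assumed field
}

// The writer starts with a single no-op node as the head of its list.
function Writer() {
  this.len = 0;                   // assumed field
  this.head = new Op(noop, 0, 0);
  this.tail = this.head;          // assumed field
}

// Factory method: just instantiates the base class.
Writer.create = function create() {
  return new Writer();
};

const w = Writer.create();
console.log(w.head.fn === noop, w.head.next); // true undefined
```

In a healthy build the head node's next is undefined right after construction, which is what makes any other value at that point so suspicious.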

At this moment, I feel that we are getting closer to the truth!

As shown on the left of the figure below, hovering over the f variable for a moment brings up a link to its definition. Clicking it jumps straight to the definition, shown on the right of the figure below (it is in fact not far away).

You may have noticed that in the code we looked at earlier, this.next is clearly set to undefined . Why is it set to g here? And on line 189456 this g is assigned g = s.base64 , which is why this.head.next holds such a strange value. Looking at the protobuf.js code that the dependency actually imports, this.next is also set to g there, but that g has no connection to u.base64 .

I have dealt with cases of code breaking after being minified twice before, so at this point I could basically conclude the following: the protobuf.js pulled in by our dependency is already minified, and our build minifies that minified code again, which left variable references pointing at the wrong values. This also explains why the problem appears only in production and cannot be reproduced locally: the local build is not minified.
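The end state can be imitated in a few lines. The sketch below does not show how the minifier arrived there, which I did not trace; it only contrasts a g that is never assigned with a g that also receives another value. intendedBuild , collidedBuild and the stand-in object are hypothetical.

```javascript
// Intended behaviour: `g` is declared but never assigned,
// so it serves as a short alias for undefined.
function intendedBuild() {
  var g;
  function Op(fn) {
    this.fn = fn;
    this.next = g;
  }
  return new Op(function noop() {});
}

// Observed behaviour after the second minification: the same name `g`
// also holds another value (a stand-in for s.base64 here).
function collidedBuild() {
  var g;
  g = { encode() {}, decode() {} }; // hypothetical stand-in object
  function Op(fn) {
    this.fn = fn;
    this.next = g;
  }
  return new Op(function noop() {});
}

console.log(intendedBuild().next);           // undefined
console.log(typeof collidedBuild().next.fn); // undefined
```

In the second case every new Op already has a successor, and that successor has no fn , which matches the error raised when the list is walked.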

🛠 How to fix

With the cause found, there are two possible fixes. One is to tackle it head-on and find out why the minifier produces this result. The other is to sidestep it: import the unminified code instead of the minified code, and let the project minify everything uniformly at build time.

Either approach solves the problem. The first would take a long time, so we used the second as a temporary fix. Since the dependency is not maintained by us, the only option is to patch the module with patch-package , which applies our diff file to the dependencies after they are installed.

The change itself is simple: find the place where the dependency imports protobuf.min.js and change it to protobuf.js .

🗒 Afterword

My initial guess about why undefined became g after minification was that the minifier wanted a local variable that is declared but never assigned, whose value is therefore undefined . I cloned the protobuf.js repository and experimented, and it appears that UglifyJS only does this when mangle.eval is configured.

That is the full investigation of the incident caused by minification. The lessons from it can be summarized as follows, for your reference:

  1. Besides ordinary breakpoints and single-stepping, there are conditional breakpoints, logpoints and other breakpoint types. Used well, they speed up troubleshooting.
  2. While paused at a breakpoint, the console evaluates in the current context, so we can run code there and print whatever data we need.
  3. In the debugger, we can also hover over a variable to see where it is defined and jump quickly between definitions.
  4. Minified code is not scary. By comparing it with the source and searching for keywords that cannot be minified away, we can find our way around it.
  5. As long as it's a reproducible problem, it's not a problem!

Finally, I wish everyone a good start back at work and a bug-free new year!


公子