I'm new to neuroscience research and came across a term called negative spikes (sometimes also used in conjunction with the term biphasic spike), but could not find what it meant from scientific papers. Can a spike occur in a negative direction (hyperpolarization from the resting potential first, followed by depolarization) or does this term mean something else entirely?
This terminology most likely refers to extracellularly recorded action potentials. Depending on the geometric proximity and the interface properties of the recording probe (e.g. a metal microelectrode in the electrolyte surrounding the neuron), and on which neuronal compartment is closest to it (e.g. the soma, the axon, or a dendrite), the detected extracellular waveform can have a positive or a negative peak.
This "polarity" reflects, to a first approximation, the redistribution of ions outside the cell when, for example, the cell fires an action potential.
The action potential itself always follows the usual intracellular course: depolarization up to roughly +50 mV followed by repolarization back to about -70 mV; it does not occur "in reverse".
I am currently coding sensor fusion for the pose of a wheeled robot from GPS, Lidar, vision, and vehicle measurements. The model is basic kinematics with an EKF, and it makes no distinction between sensors, i.e. data is processed purely in timestamp order.
I am having difficulty fusing these sensors because of the following issue:
Sometimes, when the latest measurement comes from a different sensor than the one that produced the previous state, the newest pose estimate ends up behind the previous pose. As a result, the fused trajectory is not smooth and zigzags.
I would like to discard measurements that fall behind the previous estimate and keep only those that move the pose forward, even when the sensor providing the data changes between timestamp t and timestamp t+1. Since the data are expressed in a global frame, I cannot simply check for a negative x coordinate to achieve this.
Please let me know if you have any ideas on this. Thank you so much in advance.
Best,
Preliminary warning
Let me slip in a warning before suggesting possible solutions to your problem: be careful about discarding data based on your current estimate, since you never know whether the latest measurement is "pulling the pose back" or the previous one was wrong and pushed your estimate too far forward.
Possible solutions
In a Kalman-like filter, observations are assumed to provide independent, uncorrelated information about state vector variables. These observations are assumed to have a random error distributed as a zero mean gaussian variable. Real life is harder, though :-(
Sometimes, measurements are affected by a "bias" (a fixed term, similar to the gaussian error having a non-zero mean). For example, tropospheric perturbations are known to introduce a position error in GPS fixes that drifts slowly over time.
If you have several sensors observing the same variable, such as GPS and Lidar for position, but they have different biases, your estimate will be jumping back and forth. Scaling problems can have a similar effect.
I will assume this is the root of your problem. If not, please refine your question.
How can you mitigate this problem? I see several alternatives:
Introduce a bias/scale correction term in your state vector to compensate for sensor bias/drift. This is a very common trick in EKFs for inertial sensor fusion (gyro/accelerometer) that can work nicely when tuned properly (see the bias-augmentation sketch at the end of this answer).
Apply some preprocessing to sensory inputs to correct known problems. It can be difficult to tune a filter for estimating state vector and sensor parameters at the same time.
Change how observations are interpreted. For example, use the difference between consecutive position observations, effectively creating a fake odometry sensor. This greatly reduces the drift problem.
Post-process your output. Instead of discarding observations, integrate them and keep the "jumping" state vector internally, but smooth the output vector to eliminate the jumps. This is done in some UAV autopilots because such jumps affect the performance of PID controllers.
Finally, the most obvious and simple approach: discard observations based on some statistical test. A chi-square test on the residual can be used to determine whether an observation is too far from the expected value and must be discarded (a minimal sketch follows this list). Be careful with this option, though: observation rejection schemes must be complemented with state vector reinitialization logic to result in stable behavior.
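As an illustration of that last option, here is a minimal sketch of chi-square gating on the innovation (measurement residual). The matrix names (`x`, `P`, `H`, `R`) and the 95% thresholds are generic assumptions for an EKF update, not taken from your code.

```python
import numpy as np

# 95% chi-square critical values by residual dimension (1..3); extend as needed.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}

def gate_measurement(z, x, P, H, R):
    """Return True if measurement z passes the chi-square gate.

    z : measurement vector
    x : predicted state vector
    P : predicted state covariance
    H : measurement Jacobian (so that z is approximately H @ x)
    R : measurement noise covariance
    """
    y = z - H @ x                             # innovation (residual)
    S = H @ P @ H.T + R                       # innovation covariance
    d2 = float(y.T @ np.linalg.solve(S, y))   # squared Mahalanobis distance
    return d2 <= CHI2_95[len(z)]

# Usage: only apply the EKF update when the gate accepts the observation;
# if too many consecutive observations are rejected, reinitialize the filter.
```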
Almost all of these solutions require knowing the source of each observation, so you would no longer be able to treat all sensors interchangeably.
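And as an illustration of the first alternative, here is a minimal sketch of bias augmentation on a hypothetical 1-D linear example: GPS position is assumed to carry a slowly drifting bias, which is simply added to the state vector so the filter estimates it alongside position and velocity. All matrices and noise values are illustrative assumptions.

```python
import numpy as np

dt = 0.1
# State: [position, velocity, gps_bias]; the bias is modelled as a slow random walk.
F = np.array([[1.0, dt, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
Q = np.diag([0.01, 0.1, 1e-4])        # process noise; the bias drifts slowly

# GPS measures position plus its own bias; odometry measures velocity only.
H_gps = np.array([[1.0, 0.0, 1.0]])
H_odo = np.array([[0.0, 1.0, 0.0]])

def kf_predict(x, P):
    """Standard Kalman prediction step."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Standard Kalman update step."""
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Minimal usage with made-up numbers:
x, P = np.array([0.0, 1.0, 0.0]), np.eye(3)
x, P = kf_predict(x, P)
x, P = kf_update(x, P, np.array([0.35]), H_gps, np.array([[4.0]]))   # biased GPS fix
x, P = kf_update(x, P, np.array([1.0]),  H_odo, np.array([[0.01]]))  # wheel odometry speed
print(x)  # x[2] is the estimated GPS bias; it becomes observable over time
          # as odometry and GPS disagree, instead of making the pose jump.
```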
I’ve been reading a bit about feature hashing for dimensionality reduction. I understand that it’s important to use a hash function that has a uniform output distribution (the chance of an input being mapped to a specific value is the same as for every other value in the range), as well as an avalanche/cascade effect (a small change in input produces a big change in output). These properties will ensure that collisions between features will be independent of their frequency. However, I’m still unclear on how the avalanche effect (specifically) impacts this. Could anyone explain why/how it matters here? What constitutes a ‘big change’ in output?
References:
http://blog.someben.com/2013/01/hashing-lang/
http://metaoptimize.com/qa/questions/6943/what-is-the-hashing-trick#6945
The idea is that if you have a tight cluster of input data, you still want the hashing function to splatter the outputs all over the map. The effect is that a collision will be a uniformly random event, as opposed to that tight cluster giving you a spate of collisions -- or a spate of collisions with the mappings of another tight cluster.
"Big change" suggests that your hashing function, h, should show that h(a) - h(b) is stochastically independent of (a-b).
Is that enough? Follow up if you need more explanation.
The avalanche effect ensures that a tiny change in the input (e.g. words: cloud vs clouds) will produce a big change in the output, that is, that close input values will produce distant and unpredictable output values.
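For concreteness, here is a minimal sketch of the hashing trick itself, using Python's built-in MD5 (via `hashlib`) as a stand-in for a hash with good uniformity and avalanche behaviour; the bucket count is an arbitrary choice for illustration. With a poorly mixing hash, similar tokens such as "cloud" and "clouds" would tend to land in nearby buckets, so a tight cluster of inputs would produce correlated collisions rather than uniformly random ones.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 2 ** 10  # dimensionality of the hashed feature space (arbitrary)

def bucket(feature: str) -> int:
    """Map a feature name to a bucket index via a well-mixed hash."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % NUM_BUCKETS

def hash_features(tokens):
    """Turn a list of tokens into a sparse {index: count} feature vector."""
    vec = Counter()
    for tok in tokens:
        vec[bucket(tok)] += 1
    return dict(vec)

# Near-identical inputs land in unrelated buckets (the avalanche effect):
print(bucket("cloud"), bucket("clouds"))
print(hash_features("the cloud stored the clouds".split()))
```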
As far as I understand, admissibility for a heuristic means never exceeding the actual cost to the goal for a given, evaluated node. I've had to design some heuristics for an A* search over state spaces and have gained a lot of efficiency using a heuristic that may sometimes return negative values, thereby giving certain nodes that are more 'closely formed' to the goal state a higher place in the frontier.
However, I worry that this is inadmissible, but can't find enough information online to verify this. I did find this one paper from the University of Texas that seems to mention in one of the later proofs that "...since heuristic functions are nonnegative". Can anyone confirm this? I assume it is because returning a negative value as your heuristic function would turn your g-cost negative (and therefore interfere with the 'default' Dijkstra-esque behavior of A*).
Conclusion: Heuristic functions that produce negative values are not inadmissible, per se, but have the potential to break the guarantees of A*.
Interesting question. Fundamentally, the only requirement for admissibility is that a heuristic never over-estimates the distance to the goal. This is important, because an overestimate in the wrong place could artificially make the best path look worse than another path, and prevent it from ever being explored. Thus a heuristic that can provide overestimates loses any guarantee of optimality. Underestimating does not carry the same costs. If you underestimate the cost of going in a certain direction, eventually the edge weights will add up to be greater than the cost of going in a different direction, so you'll explore that direction too. The only problem is loss of efficiency.
If all of your edges have positive costs, a negative heuristic value can only ever be an underestimate. In theory, an underestimate should only ever be worse than a more precise estimate, because it provides strictly less information about the potential cost of a path, and is likely to result in more nodes being expanded. Nevertheless, it will not be inadmissible.
However, here is an example that demonstrates that it is theoretically possible for negative heuristic values to break the guaranteed optimality of A*:
In this graph, it is obviously better to go through nodes A and B. This will have a cost of three, as opposed to six, which is the cost of going through nodes C and D. However, the negative heuristic values for C and D will cause A* to reach the end through them before exploring nodes A and B. In essence, the heuristic function keeps thinking that this path is going to get drastically better, until it is too late. In most implementations of A*, this will return the wrong answer, although you can correct for this problem by continuing to explore other nodes until the lowest f(n) value remaining in the frontier is greater than the cost of the path you found. Note that there is nothing inadmissible or inconsistent about this heuristic. I'm actually really surprised that non-negativity is not more frequently mentioned as a rule for A* heuristics.
Of course, all that this demonstrates is that you can't freely use heuristics that return negative values without fear of consequences. It is entirely possible that a given heuristic for a given problem would happen to work out really well despite being negative. For your particular problem, it's unlikely that something like this is happening (and I find it really interesting that it works so well for your problem, and still want to think more about why that might be).
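If you want to experiment with this yourself, here is a minimal A* sketch (graph search that terminates when the goal is popped, re-pushing nodes when a cheaper path is found) together with a small hypothetical graph and heuristic, not the graph from the figure above. With the negative values on C and D, those nodes are expanded first even though their path costs 6 instead of 3; in this particular sketch the optimal path is still returned, so the damage is the wasted expansions described earlier, but an implementation that stops as soon as the goal is first generated would return the cost-6 path.

```python
import heapq

def a_star(graph, h, start, goal):
    """A* graph search; returns (cost, path). Terminates when the goal is popped."""
    open_heap = [(h(start), 0, start, [start])]  # entries are (f, g, node, path)
    best_g = {start: 0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry, a cheaper path to this node was already found
        print("expanding", node)  # watch the expansion order
        for nxt, cost in graph[node]:
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return float("inf"), []

# Hypothetical graph: S-A-B-G costs 3, S-C-D-G costs 6.
graph = {
    "S": [("A", 1), ("C", 2)],
    "A": [("B", 1)],
    "B": [("G", 1)],
    "C": [("D", 2)],
    "D": [("G", 2)],
    "G": [],
}
# Illustrative heuristic with negative values on C and D (still an underestimate).
h_values = {"S": 0, "A": 2, "B": 1, "C": -5, "D": -5, "G": 0}
print(a_star(graph, h_values.get, "S", "G"))  # C and D are expanded before A and B
```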
SOM (Self-Organizing Map): every input dimension maps to all output nodes, and the nodes compete with each other for the score (vector quantization). PCA and other clustering methods can be seen as simplified special cases of this process.
There is only ever a single winning node in a SOM. However, what happens when an input strongly resembles two established 'clusters'? Could it so happen that the first neuron wins over a second neuron by a small margin and yet the two are very far apart? If so, would it not also be extremely useful information?
If so, then it means the entire activation pattern with all its various outputs would be useful in classifying an input.
The reason I'm asking is because I'm considering plugging SOMs into other neural networks and then maybe back again into SOMs. And when plugging in, I wish to know if it would be safe to just carry over the entire lattice with all its outputs instead of just the winning node.
I have tried checking the math of the SOM: during training only the winning neuron (and its neighbourhood) is considered, but nothing seems to indicate that, when a new input is presented, only the winning node is of importance to the operator.
The goal of the algorithm is that, at the end of training, the first and second winning nodes of each input pattern lie in adjacent positions in the lattice. This is referred to as topology preservation of the input data space. The opposite case is considered bad training and is quantified by the topological error. One simple measure of this error is the ratio of input vectors for which the first and second winning nodes are not adjacent.
Search for SOM and topology preservation.
Here is a quick link.
Keep in mind that small maps generally produce a smaller topological error but an increased quantization error, whereas larger maps tend to invert this situation. So there is a trade-off between topology preservation and quantization accuracy. There isn't a golden rule for this; it always depends on the domain, the application and the expected results.
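As a concrete illustration of that error measure, here is a minimal NumPy sketch that computes the fraction of inputs whose best and second-best matching units are not neighbours on the lattice. The `rows x cols x dim` weight layout and the 8-neighbourhood adjacency are assumptions; adapt the indexing to whatever SOM implementation you use.

```python
import numpy as np

def topological_error(weights, data):
    """Fraction of inputs whose 1st and 2nd best matching units are not adjacent.

    weights : array of shape (rows, cols, dim), the trained SOM codebook
    data    : array of shape (n_samples, dim)
    """
    rows, cols, dim = weights.shape
    flat = weights.reshape(rows * cols, dim)
    errors = 0
    for x in data:
        dists = np.linalg.norm(flat - x, axis=1)
        bmu1, bmu2 = np.argsort(dists)[:2]       # best and second-best units
        r1, c1 = divmod(bmu1, cols)
        r2, c2 = divmod(bmu2, cols)
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:  # not 8-neighbours on the grid
            errors += 1
    return errors / len(data)

# Example call with random (untrained) weights, just to show the interface:
rng = np.random.default_rng(0)
som = rng.random((10, 10, 3))
samples = rng.random((100, 3))
print(topological_error(som, samples))
```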
So I guess this isn't technically a code question, but it's something that I'm sure will come up for other folks as well as myself while writing code, so hopefully it's still a good one to post on SO.
The Google has directed me to plenty of nice lengthy explanations of when to use one or the other as regards financial numbers, and things like that.
But my particular context doesn't fit in, and I'm wondering if anyone here has some insight. I need to take a whole bunch of individual users' votes on how "good" a particular item is. I.e., some number of users each give a particular item a score between 0 and 10, and I want to report on what the 'typical' score is. What would be the intuitive reasons to report the geometric and/or arithmetic mean as the typical response?
Or, for that matter, would I be better off reporting the median instead?
I imagine there's some psychology involved in what the "best" method might be...
Anyway, there you have it.
Thanks!
Generally speaking, the arithmetic mean will suffice. It is much less computationally intensive than the geometric mean (which involves taking an n-th root).
As for the psychology involved, the geometric mean is never greater than the arithmetic mean, so arithmetic is the best choice if you'd prefer higher scores in general.
The median is most useful when the data set is relatively small and the chance of a massive outlier relatively high. Depending on how much precision these votes can take, the median can sometimes end up being a bit arbitrary.
If you really, really want the most accurate answer possible, you could go for calculating the arithmetic-geometric mean. However, this involves calculating both arithmetic and geometric means repeatedly, so it is very computationally intensive in comparison.
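To see how the candidates behave on actual vote data, here is a small sketch using Python's standard library; the sample scores are made up. Note that a single 0 vote drives the geometric mean to 0 (and `statistics.geometric_mean` may raise on zero inputs), so a 0-10 scale needs special handling, e.g. shifting scores up by 1, if you go the geometric route.

```python
import statistics

# Hypothetical votes on a 0-10 scale (all non-zero here).
votes = [7, 8, 8, 9, 3, 10, 7, 8, 6, 9]

print("arithmetic mean:", statistics.mean(votes))
print("geometric mean: ", statistics.geometric_mean(votes))
print("median:         ", statistics.median(votes))
```

On this sample the single low vote drags the geometric mean further down than the arithmetic mean, while the median ignores it entirely, which is exactly the trade-off discussed above.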
You want the arithmetic mean, since you aren't averaging rates of change or anything like that.
Arithmetic mean is correct.
Your scale is artificial:
It is bounded, from 0 to 10
8.5 is intuitively between 8 and 9
But for other scales, you would need to consider the correct mean to use.
Some other examples
In counting money, it has been argued that wealth has logarithmic utility. So the geometric mean of Bill Gates' wealth and that of a bum in the inner city would be a moderately successful business person. (The arithmetic average would give you Larry Page.)
In measuring sound level, the decibel scale is already logarithmic, so you can take the arithmetic average of decibel values.
But if you are measuring the signal itself (rather than a logarithmic level like decibels), then use the quadratic mean (RMS).
The answer depends on the context and your purpose. Percent changes were mentioned as a good time to use the geometric mean. I use the geometric mean when calculating antennas and frequencies, since the percentage change matters more than the arithmetic middle of the frequency range or the average antenna size.

If you have wildly varying numbers, especially if most are similar but one or two are "flyers" (far from the range of the others), the geometric mean will "smooth" the results (it won't let the outliers pull the result more than they should). This approach is used to calculate bullet group sizes (the "flyer" was probably human error, not the equipment, so the plain average is "unfair" in that case).

Another related method is the root mean square (RMS): square the numbers, take the mean of those squares, and then take the square root of that mean. Unlike the geometric mean, this gives large values more weight rather than less. It is often used in electrical calculations, and most electrical meters report "RMS" (root mean square) values, not average readings.

Hope this helps a little. Here is a web site that explains it pretty well: standardwisdom.com
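For reference, here is a tiny sketch comparing the root mean square with the other means on made-up values, to make the order of operations (square, then mean, then root) concrete:

```python
import math
import statistics

values = [3.0, 4.0, 5.0, 12.0]  # made-up sample with one "flyer"

arithmetic = statistics.mean(values)
geometric = statistics.geometric_mean(values)
rms = math.sqrt(sum(v * v for v in values) / len(values))  # square, mean, then root

# RMS >= arithmetic mean >= geometric mean: the geometric mean damps the
# outlier, while RMS amplifies its influence.
print(f"geometric {geometric:.2f} <= arithmetic {arithmetic:.2f} <= RMS {rms:.2f}")
```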