Using a stateful Keras model in pure TensorFlow - machine-learning

I have a stateful RNN model with several GRU layers that was created in Keras.
I have to run this model now from Java, so I dumped the model as protobuf, and I'm loading it from Java TensorFlow.
This model must be stateful because features will be fed one timestep at a time.
As far as I understand, in order to achieve statefulness in a TensorFlow model, I must somehow feed in the last state every time I execute the session runner, and the run should also return the new state after execution.
Is there a way to output the state in the Keras model?
Is there a simpler way altogether to get a stateful Keras model to work as such using TensorFlow?
Many thanks

An alternative solution is to use the model.state_updates property of the Keras model and add it to the session.run call.
Here is a full example that illustrates this solution with two LSTMs:
import tensorflow as tf


class SimpleLstmModel(tf.keras.Model):
    """ Simple LSTM model with two stacked LSTM layers """

    def __init__(self, units=10, stateful=True):
        super(SimpleLstmModel, self).__init__()
        self.lstm_0 = tf.keras.layers.LSTM(units=units, stateful=stateful, return_sequences=True)
        self.lstm_1 = tf.keras.layers.LSTM(units=units, stateful=stateful, return_sequences=True)

    def call(self, inputs):
        """
        :param inputs: [batch_size, seq_len, 1]
        :return: output tensor
        """
        x = self.lstm_0(inputs)
        x = self.lstm_1(x)
        return x


def main():
    model = SimpleLstmModel(units=1, stateful=True)
    x = tf.placeholder(shape=[1, 1, 1], dtype=tf.float32)
    output = model(x)

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())  # initialize_all_variables() is deprecated

    # Adding model.state_updates to the fetches makes the session run the state
    # assignment ops, so the LSTM state carries over to the next call.
    res_at_step_1, _ = sess.run([output, model.state_updates], feed_dict={x: [[[0.1]]]})
    print(res_at_step_1)
    res_at_step_2, _ = sess.run([output, model.state_updates], feed_dict={x: [[[0.1]]]})
    print(res_at_step_2)


if __name__ == "__main__":
    main()
Which produces the following output:
[[[0.00168626]]]
[[[0.00434444]]]
and shows that the LSTM state is preserved between batches.
If we set stateful to False, the output becomes:
[[[0.00033928]]]
[[[0.00033928]]]
Showing that the state is not reused.

OK, so I managed to solve this problem!
What worked for me was creating tf.identity tensors not only for the outputs, as is standard, but also for the state tensors.
In the Keras models, the state tensors can be found by doing:
model.updates
Which gives something like this:
[(<tf.Variable 'gru_1_1/Variable:0' shape=(1, 70) dtype=float32_ref>,
<tf.Tensor 'gru_1_1/while/Exit_2:0' shape=(1, 70) dtype=float32>),
(<tf.Variable 'gru_2_1/Variable:0' shape=(1, 70) dtype=float32_ref>,
<tf.Tensor 'gru_2_1/while/Exit_2:0' shape=(1, 70) dtype=float32>),
(<tf.Variable 'gru_3_1/Variable:0' shape=(1, 4) dtype=float32_ref>,
<tf.Tensor 'gru_3_1/while/Exit_2:0' shape=(1, 4) dtype=float32>)]
The 'Variable' entries are used for feeding in the states, and the 'Exit' tensors are the new states produced after a run.
So I created tf.identity ops from the 'Exit' tensors and gave them meaningful names, e.g.:
tf.identity(state_variables[j], name='state' + str(j))
where state_variables contained only the 'Exit' tensors.
Then I used the input variables (e.g. gru_1_1/Variable:0) to feed the model state from TensorFlow, and the identity tensors created from the 'Exit' tensors to extract the new states after feeding the model at each timestep.
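To make that concrete, here is a minimal sketch of the approach in TF 1.x graph mode. It assumes the (variable, new-state) pairs come from model.updates as shown above; the exact attribute and tensor names depend on the Keras/TensorFlow version and on your model, so treat the names below as illustrative rather than as the exact code used.

import tensorflow as tf

# model is the stateful Keras GRU model built/loaded in the current graph.
# Depending on the Keras version, model.updates (or model.state_updates)
# yields (state_variable, new_state_tensor) pairs like the ones listed above.
state_inputs = []   # variable read tensors, used to feed the previous state
state_outputs = []  # named identity ops, used to fetch the new state

for j, (state_var, new_state) in enumerate(model.updates):
    state_inputs.append(state_var)
    # Give the new-state tensor a stable name so it can be fetched by name
    # from Java after the graph is exported as a protobuf.
    state_outputs.append(tf.identity(new_state, name='state' + str(j)))

# Per-timestep inference loop (Python equivalent of what the Java side does):
# feed the previous states together with the features, fetch output + new states.
# sess = tf.keras.backend.get_session()
# fetches = [model.output] + state_outputs
# feed = {model.input: x_t}
# feed.update(dict(zip(state_inputs, previous_states)))
# result = sess.run(fetches, feed_dict=feed)
# y_t, previous_states = result[0], result[1:]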

Related

Pass information between pipeline steps in sklearn

I am working on a simple text generation problem with LSTMs. To make the preprocessing more compact and reproducible, I decided to implement everything in sklearn fashion, using custom sklearn transformers, and the KerasClassifier from scikeras to wrap the neural network definition in a sklearn-type estimator.
It almost works, but I can't figure out how to pass information from within a certain custom transformer on to the KerasClassifier estimator. More precisely, the method that creates the neural network needs the number of outputs as an argument, but this depends on the number of words in the fitted vocabulary, information that is currently encapsulated in the ModelEncoder class.
(Note that in order to get the current logic work, I had to slightly modify the default sklearn Pipeline class, as it wouldn't allow modifying and returning both X and y. In other words, the default sklearn Pipeline only allows feature transformations but not target transformations. Modifying the custom Pipeline class was explained in this StackOverflow post.)
Example data:
train_data = ['o by no means honest ventidius i gave it freely ever and theres none can truly say he gives if our betters play at that game we must not dare to imitate them faults that are rich are fair'
'but was not this nigh shore'
'impairing henry strengthening misproud york the common people swarm like summer flies and whither fly the gnats but to the sun'
'what while you were there'
'chill pick your teeth zir come no matter vor your foins'
'thanks dear isabel' 'come prick me bullcalf till he roar again'
'go some of you knock at the abbeygate and bid the lady abbess come to me'
'an twere not as good deed as drink to break the pate on thee i am a very villain'
'beaufort it is thy sovereign speaks to thee'
'but say lucetta now we are alone wouldst thou then counsel me to fall in love'
'for being a bawd for being a bawd'
'all blest secrets all you unpublishd virtues of the earth spring with my tears'
'what likelihood' 'o find him']
Full code:
# Modify the sklearn Pipeline class to allow it to return tuples and hence enable both X and y modifications.
# (Current default implementation in sklearn only allows feature transformations, i.e. transformations on X, but not on y.)
class Pipeline(pipeline.Pipeline):
    def _fit(self, X, y=None, **fit_params_steps):
        self.steps = list(self.steps)
        self._validate_steps()
        memory = check_memory(self.memory)
        fit_transform_one_cached = memory.cache(pipeline._fit_transform_one)
        for (step_idx, name, transformer) in self._iter(
            with_final=False, filter_passthrough=False
        ):
            if transformer is None or transformer == "passthrough":
                with _print_elapsed_time("Pipeline", self._log_message(step_idx)):
                    continue
            try:
                # joblib >= 0.12
                mem = memory.location
            except AttributeError:
                mem = memory.cachedir
            finally:
                cloned_transformer = clone(transformer) if mem else transformer
            X, fitted_transformer = fit_transform_one_cached(
                cloned_transformer,
                X,
                y,
                None,
                message_clsname="Pipeline",
                message=self._log_message(step_idx),
                **fit_params_steps[name],
            )
            if isinstance(X, tuple):  ###### unpack X if is tuple X = (X,y)
                X, y = X
            self.steps[step_idx] = (name, fitted_transformer)
        return X, y

    def fit(self, X, y=None, **fit_params):
        fit_params_steps = self._check_fit_params(**fit_params)
        Xt = self._fit(X, y, **fit_params_steps)
        if isinstance(Xt, tuple):  ###### unpack X if is tuple X = (X,y)
            Xt, y = Xt
        with _print_elapsed_time("Pipeline", self._log_message(len(self.steps) - 1)):
            if self._final_estimator != "passthrough":
                fit_params_last_step = fit_params_steps[self.steps[-1][0]]
                self._final_estimator.fit(Xt, y, **fit_params_last_step)
        return self
class ModelTokenizer(TransformerMixin, BaseEstimator):
    def __init__(self, max_len=100):
        super().__init__()
        self.max_len = max_len

    def fit(self, X=None, y=None):
        return self

    def transform(self, X, y=None):
        X_flattened = " ".join(X).split()
        sequences = list()
        for i in range(self.max_len + 1, len(X_flattened)):
            seq = X_flattened[i - self.max_len - 1:i]
            sequences.append(seq)
        return sequences
class ModelEncoder(TransformerMixin, BaseEstimator):
    def __init__(self):
        super().__init__()
        self.tokenizer = Tokenizer()

    def fit(self, X=None, y=None):
        self.tokenizer.fit_on_texts(X)
        return self

    def transform(self, X, y=None):
        encoded_sequences = np.array(self.tokenizer.texts_to_sequences(X))
        return (encoded_sequences[:, :-1], encoded_sequences[:, -1])
def create_nn(input_shape=(100, 1), output_shape=None):
    model = Sequential()
    model.add(LSTM(64, input_shape=input_shape, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(Flatten())
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(output_shape, activation='softmax'))
    metrics_list = [tf.keras.metrics.BinaryAccuracy(name='accuracy')]
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=metrics_list)
    return model
pipe = Pipeline([
    ('tokenizer', ModelTokenizer()),
    ('encoder', ModelEncoder()),
    ('model', KerasClassifier(build_fn=create_nn, epochs=10, output_shape=vocab_size)),
])
# Question: how to pass 'vocab_size'?
Imports:
from sklearn import pipeline
from sklearn.base import clone
from sklearn.utils import _print_elapsed_time
from sklearn.utils.validation import check_memory
from sklearn.base import BaseEstimator, TransformerMixin
from keras.preprocessing.text import Tokenizer
from scikeras.wrappers import KerasClassifier
KerasClassifier has its own internal transformer (see here, it is used to provide one-hot encoding and such) which has an API to pass metadata to the model (see here, that's how arguments such as n_outputs_ are passed into the model building function). Could you override that to pass this extra metadata to the model? It's stepping a bit outside of the Scikit-Learn API, but as you've noted the Scikit-Learn API doesn't have this functionality built in. If you want to propagate that information from a Transformer in your pipeline into SciKeras you could encode it into a feature and then use the above-mentioned hooks along with a custom encoder to remove that feature and convert it into metadata that can be passed into the model, but now you'd be really pushing the Scikit-Learn API.
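One of the hooks mentioned above is the meta dictionary that SciKeras passes to the model-building function at fit time. A rough sketch of that route is below; it assumes that the encoded target words themselves define the number of output classes, and the layer sizes and the reshape are illustrative rather than taken from the question.

from scikeras.wrappers import KerasClassifier
from tensorflow import keras

def create_nn(meta, hidden_units=64):
    # SciKeras fills `meta` at fit time; n_classes_ is inferred from y, which
    # here plays the role of vocab_size (only words that occur as targets count).
    n_classes = meta["n_classes_"]
    seq_len = meta["n_features_in_"]
    model = keras.Sequential([
        keras.layers.Reshape((seq_len, 1), input_shape=(seq_len,)),
        keras.layers.LSTM(hidden_units),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

clf = KerasClassifier(model=create_nn, epochs=10)

With this, vocab_size never has to be passed explicitly; the wrapper derives the output size from the targets produced by ModelEncoder.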

How can I use an LSTM to classify a series of vectors into two categories in Pytorch

I have a series of vectors representing a signal over time. I'd like to classify parts of the signal into two categories: 1 or 0. The reason for using LSTM is that I believe the network will need knowledge of the entire signal to classify.
My problem is developing the PyTorch model. Below is the class I've come up with.
class LSTMClassifier(nn.Module):
    def __init__(self, input_dim, hidden_dim, label_size, batch_size):
        self.lstm = nn.LSTM(input_dim, hidden_dim)
        self.hidden2label = nn.Linear(hidden_dim, label_size)
        self.hidden = self.init_hidden()

    def init_hidden(self):
        return (torch.zeros(1, self.batch_size, self.hidden_dim),
                torch.zeros(1, self.batch_size, self.hidden_dim))

    def forward(self, x):
        lstm_out, self.hidden = self.lstm(x, self.hidden)
        y = self.hidden2label(lstm_out[-1])
        log_probs = F.log_softmax(y)
        return log_probs
However, this model is giving a bunch of shape errors, and I'm having trouble understanding everything that's going on. I looked at this SO question first.
You should always follow the PyTorch documentation, especially the inputs and outputs sections.
This is how the classifier should look:
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    def __init__(self, input_dim, hidden_dim, label_size):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.hidden2label = nn.Linear(hidden_dim, label_size)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)
        return self.hidden2label(h_n.reshape(x.shape[0], -1))


clf = LSTMClassifier(100, 200, 1)
inputs = torch.randn(64, 10, 100)
clf(inputs)
Points to consider:
Always use super().__init__(), as it registers modules in your neural network, allows hooks, etc.
Use batch_first=True so you can pass inputs of shape (batch, timesteps, n_features).
There is no need to init_hidden with zeros; zeros are the default when the hidden state is left uninitialized.
There is also no need to pass self.hidden to the LSTM on every call. Moreover, you should not do that: it implies that consecutive batches are continuations of one another, whereas batch elements should be independent, which is probably not what you want.
_, (h_n, _) gives the hidden state from the last timestep, of shape (num_layers * num_directions, batch, hidden_size). In our case num_layers and num_directions are 1, so we get a (1, batch, hidden_size) tensor.
Reshape to (batch, hidden_size) so it can be passed through the linear layer.
Return logits without an activation. Use torch.nn.BCEWithLogitsLoss as the loss for the binary case and torch.nn.CrossEntropyLoss for the multiclass case; sigmoid is the matching activation for binary and softmax or log_softmax for multiclass, but those losses already apply them internally.
For binary classification only one output is needed; any value below 0 (when returning unnormalized logits, as here) is treated as the negative class, anything above as the positive class (see the training sketch below).
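A minimal training sketch under these assumptions (single logit output, BCEWithLogitsLoss; the data tensors are random stand-ins for real signals and labels):

import torch
import torch.nn as nn

clf = LSTMClassifier(input_dim=100, hidden_dim=200, label_size=1)
criterion = nn.BCEWithLogitsLoss()            # expects raw logits, applies sigmoid internally
optimizer = torch.optim.Adam(clf.parameters(), lr=1e-3)

x = torch.randn(64, 10, 100)                  # (batch, timesteps, n_features)
y = torch.randint(0, 2, (64, 1)).float()      # binary targets as floats

optimizer.zero_grad()
logits = clf(x)                               # shape (64, 1), unnormalized
loss = criterion(logits, y)
loss.backward()
optimizer.step()

predictions = (logits > 0).long()             # logit > 0 corresponds to probability > 0.5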

Transfer learning with CNTK and pre-trained ONNX model fails

I'm trying to use the ResNet-50 model from the ONNX model zoo and load and train it in CNTK for an image classification task. The first thing that confuses me is that the batch axis (not sure what the official name for it is, dynamic axis?) is set to 1 in this model:
Why is that? Couldn't it simply be [3x224x224]? In this model for example, the input looks like this:
To load the model and use my own Dense layer, I use the following code:
def create_model(num_classes, input_features, freeze=False):
    base_model = load_model("restnet-50.onnx", format=ModelFormat.ONNX)
    feature_node = find_by_name(base_model, "gpu_0/data_0")
    last_node = find_by_uid(base_model, "Reshape2959")
    substitutions = {
        feature_node: placeholder(name='new_input')
    }
    cloned_layers = last_node.clone(CloneMethod.clone, substitutions)
    cloned_out = cloned_layers(input_features)
    z = Dense(num_classes, activation=softmax, name="prediction")(cloned_out)
    return z
For training I use (shortened):
# datasets = list of classes
feature = input_variable(shape=(1, 3, 224, 224))
label = input_variable(shape=(1, 3))
model = create_model(len(datasets), feature)
loss = cross_entropy_with_softmax(model, label)
# some definitions for learner, epochs, ProgressPrinters missing
for epoch in range(epochs):
    loss.train((X_current, y_current), parameter_learners=[learner], callbacks=[progress_printer])
X_current is a single image and y_current the corresponding class label, both encoded as numpy arrays with the following shapes:
X_current.shape
(1, 3, 224, 224)
y_current.shape
(1, 3)
When I try to train the model, I get
"ValueError: ToBatchAxis7504 ToBatchAxisNode operation can only operate on tensor without minibatch data (no layout)"
What's wrong here?

TypeError: 'Tensor' object is not callable

I'm trying to display the output of each layer of the convolutional neural network.
The backend I'm using is TensorFlow.
Here is the code:
import ....
from keras import backend as K

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(1, 28, 28)))
convout1 = Activation('relu')
model.add(convout1)

(X_train, y_train), (X_test, y_test) = mnist_dataset = mnist.load_data("mnist.pkl")
reshaped = X_train.reshape(X_train.shape[0], 1, X_train.shape[1], X_train.shape[2])

from random import randint
img_to_visualize = randint(0, len(X_train) - 1)

# Generate function to visualize first layer
# ERROR HERE
convout1_f = K.function([model.input(train=False)], convout1.get_output(train=False))  # ERROR HERE
convolutions = convout1_f(reshaped[img_to_visualize: img_to_visualize+1])
The full error is:
convout1_f = K.function([model.input(train=False)], convout1.get_output(train=False))
TypeError: 'Tensor' object is not callable
Any comment or suggestion is highly appreciated. Thank you.
Both the get_output and get_input methods return either a Theano or a TensorFlow tensor. It's not callable because of the nature of these objects.
In order to compile a function you should provide only layer tensors and a special Keras tensor called learning_phase, which indicates whether the model should be run in training or inference mode.
Following this answer, your function should look like this:
convout1_f = K.function([model.input, K.learning_phase()], [convout1.output])
Remember to pass either 1 (training) or 0 (inference) for the learning phase when calling your function, so the computation runs in the corresponding mode.
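Assuming Keras 1.x with the TensorFlow backend, calling the compiled function would then look roughly like this (0 selects the test phase, 1 the training phase):

# The function takes a list of inputs (the image batch and the learning-phase flag)
# and returns a list of outputs, so index [0] extracts the activation array.
convolutions = convout1_f([reshaped[img_to_visualize:img_to_visualize + 1], 0])[0]
print(convolutions.shape)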

Put customized functions in Sklearn pipeline

In my classification scheme, there are several steps including:
SMOTE (Synthetic Minority Over-sampling Technique)
Fisher criteria for feature selection
Standardization (Z-score normalisation)
SVC (Support Vector Classifier)
The main parameters to be tuned in the scheme above are the percentile (2.) and the hyperparameters for the SVC (4.), and I want to tune them via grid search.
The current solution builds a "partial" pipeline including steps 3 and 4 of the scheme:
clf = Pipeline([('normal', preprocessing.StandardScaler()), ('svc', svm.SVC(class_weight='auto'))])
and breaks the scheme into two parts:
Tune the percentile of features to keep through the first grid search
skf = StratifiedKFold(y)
for train_ind, test_ind in skf:
    X_train, X_test, y_train, y_test = X[train_ind], X[test_ind], y[train_ind], y[test_ind]
    # SMOTE synthesizes the training data (we want to keep test data intact)
    X_train, y_train = SMOTE(X_train, y_train)
    for percentile in percentiles:
        # Fisher returns the indices of the selected features specified by the parameter 'percentile'
        selected_ind = Fisher(X_train, y_train, percentile)
        X_train_selected, X_test_selected = X_train[selected_ind, :], X_test[selected_ind, :]
        model = clf.fit(X_train_selected, y_train)
        y_predict = model.predict(X_test_selected)
        f1 = f1_score(y_predict, y_test)
The f1 scores are stored and then averaged over all fold partitions for each percentile, and the percentile with the best CV score is returned. The purpose of making the percentile loop the inner loop is to allow a fair comparison, since the same training data (including the synthesized data) is used across all percentiles within each fold, as sketched below.
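Something like the following bookkeeping, placed around the loops above (the names are illustrative, not from the original code):

from collections import defaultdict
import numpy as np

scores = defaultdict(list)          # percentile -> list of per-fold f1 scores
# inside the inner loop above:
#     scores[percentile].append(f1)

cv_scores = {p: np.mean(f1_list) for p, f1_list in scores.items()}
best_percentile = max(cv_scores, key=cv_scores.get)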
After determining the percentile, tune the hyperparameters by second grid search
skf = StratifiedKFold(y)
for train_ind, test_ind in skf:
    X_train, X_test, y_train, y_test = X[train_ind], X[test_ind], y[train_ind], y[test_ind]
    # SMOTE synthesizes the training data (we want to keep test data intact)
    X_train, y_train = SMOTE(X_train, y_train)
    for parameters in parameter_comb:
        # Select the features based on the tuned percentile
        selected_ind = Fisher(X_train, y_train, best_percentile)
        X_train_selected, X_test_selected = X_train[selected_ind, :], X_test[selected_ind, :]
        clf.set_params(svc__C=parameters['C'], svc__gamma=parameters['gamma'])
        model = clf.fit(X_train_selected, y_train)
        y_predict = model.predict(X_test_selected)
        f1 = f1_score(y_predict, y_test)
It is done in a very similar way, except that we tune the hyperparameters for the SVC rather than the percentile of features to select.
My questions are:
In the current solution, I only involve 3. and 4. in clf and do 1. and 2. somewhat "manually" in two nested loops as described above. Is there any way to include all four steps in a pipeline and do the whole process at once?
If it is okay to keep the first nested loop, then is it possible (and how) to simplify the next nested loop using a single pipeline
clf_all = Pipeline([('smote', SMOTE()),
                    ('fisher', Fisher(percentile=best_percentile)),
                    ('normal', preprocessing.StandardScaler()),
                    ('svc', svm.SVC(class_weight='auto'))])
and simply use GridSearchCV(clf_all, parameter_comb) for tuning?
Please note that both SMOTE and Fisher (ranking criteria) have to be done only for the training data in each fold partition.
Any comment would be much appreciated.
SMOTE and Fisher are shown below:
from numpy import shape, argsort, ceil

def Fscore(X, y, percentile=None):
    X_pos, X_neg = X[y==1], X[y==0]
    X_mean = X.mean(axis=0)
    X_pos_mean, X_neg_mean = X_pos.mean(axis=0), X_neg.mean(axis=0)
    deno = (1.0/(shape(X_pos)[0]-1))*X_pos.var(axis=0) + (1.0/(shape(X_neg)[0]-1))*X_neg.var(axis=0)
    num = (X_pos_mean - X_mean)**2 + (X_neg_mean - X_mean)**2
    F = num/deno
    sort_F = argsort(F)[::-1]
    n_feature = (float(percentile)/100)*shape(X)[1]
    ind_feature = sort_F[:int(ceil(n_feature))]
    return ind_feature
SMOTE is from https://github.com/blacklab/nyan/blob/master/shared_modules/smote.py; it returns the synthesized data. I modified it to return the original input data stacked with the synthesized data, along with the original and synthesized labels.
def smote(X, y):
    n_pos, n_neg = sum(y==1), sum(y==0)
    n_syn = (n_neg-n_pos)/float(n_pos)
    X_pos = X[y==1]
    X_syn = SMOTE(X_pos, int(round(n_syn))*100, 5)
    y_syn = np.ones(shape(X_syn)[0])
    X, y = np.vstack([X, X_syn]), np.concatenate([y, y_syn])
    return (X, y)
scikit-learn added a FunctionTransformer to the preprocessing module in version 0.17. It can be used in a similar manner to David's implementation of the Fisher class in the answer below, but with less flexibility. If the input/output of the function is configured properly, the transformer can implement the fit/transform/fit_transform methods for the function and thus allow it to be used in a scikit-learn pipeline.
For example, if the input to a pipeline is a series, the transformer would be as follows:
def trans_func(input_series):
    return output_series

from sklearn.preprocessing import FunctionTransformer
transformer = FunctionTransformer(trans_func)

sk_pipe = Pipeline([("trans", transformer), ("vect", tf_1k), ("clf", clf_1k)])
sk_pipe.fit(train.desc, train.tag)
where vect is a tf-idf transformer, clf is a classifier, and train is the training dataset. "train.desc" is the text series fed into the pipeline.
I don't know where your SMOTE() and Fisher() functions are coming from, but the answer is yes, you can definitely do this. In order to do so you will need to write a wrapper class around those functions, though. The easiest way to do this is to inherit from sklearn's BaseEstimator and TransformerMixin classes; see this for an example: http://scikit-learn.org/stable/auto_examples/hetero_feature_union.html
If this isn't making sense to you, post the details of at least one of your functions (the library it comes from or your code if you wrote it yourself) and we can go from there.
EDIT:
I apologize, I didn't look at your functions closely enough to realize that they transform your target in addition to your training data (i.e. both X and y). Pipeline does not support transformations to your target, so you will have to do them beforehand, as you originally were. For your reference, here is what it would look like to write a custom class for your Fisher process, which would work if the function itself did not need to affect your target variable.
>>> from sklearn.base import BaseEstimator, TransformerMixin
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.svm import SVC
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.grid_search import GridSearchCV
>>> from sklearn.datasets import load_iris
>>>
>>> class Fisher(BaseEstimator, TransformerMixin):
...     def __init__(self, percentile=0.95):
...         self.percentile = percentile
...     def fit(self, X, y):
...         from numpy import shape, argsort, ceil
...         X_pos, X_neg = X[y==1], X[y==0]
...         X_mean = X.mean(axis=0)
...         X_pos_mean, X_neg_mean = X_pos.mean(axis=0), X_neg.mean(axis=0)
...         deno = (1.0/(shape(X_pos)[0]-1))*X_pos.var(axis=0) + (1.0/(shape(X_neg)[0]-1))*X_neg.var(axis=0)
...         num = (X_pos_mean - X_mean)**2 + (X_neg_mean - X_mean)**2
...         F = num/deno
...         sort_F = argsort(F)[::-1]
...         n_feature = (float(self.percentile)/100)*shape(X)[1]
...         self.ind_feature = sort_F[:ceil(n_feature)]
...         return self
...     def transform(self, x):
...         return x[self.ind_feature,:]
...
>>>
>>> data = load_iris()
>>>
>>> pipeline = Pipeline([
...     ('fisher', Fisher()),
...     ('normal', StandardScaler()),
...     ('svm', SVC(class_weight='auto'))
... ])
>>>
>>> grid = {
...     'fisher__percentile': [0.75, 0.50],
...     'svm__C': [1, 2]
... }
>>>
>>> model = GridSearchCV(estimator = pipeline, param_grid=grid, cv=2)
>>> model.fit(data.data,data.target)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/grid_search.py", line 596, in fit
    return self._fit(X, y, ParameterGrid(self.param_grid))
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/grid_search.py", line 378, in _fit
    for parameters in parameter_iterable
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 653, in __call__
    self.dispatch(function, args, kwargs)
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 400, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 138, in __init__
    self.results = func(*args, **kwargs)
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/cross_validation.py", line 1239, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/pipeline.py", line 130, in fit
    self.steps[-1][-1].fit(Xt, y, **fit_params)
  File "/Users/dmcgarry/anaconda/lib/python2.7/site-packages/sklearn/svm/base.py", line 149, in fit
    (X.shape[0], y.shape[0]))
ValueError: X and y have incompatible shapes.
X has 1 samples, but y has 75.
You actually can put all of these functions into a single pipeline!
In the accepted answer, @David wrote that your functions
transform your target in addition to your training data (i.e. both X and y). Pipeline does not support transformations to your target, so you will have to do them beforehand, as you originally were.
It is true that sklearn's pipeline does not support this. However, imblearn's pipeline here does support this. The imblearn pipeline is just like sklearn's, but it allows you to call transformations separately on the training and testing data via sample methods. Moreover, these sample methods are actually designed so that you can change both the data X and the labels y. This is important because many times you want to include SMOTE in your pipeline but apply it only to the training data, not the testing data. And with the imblearn pipeline, you can call SMOTE in the pipeline to transform just X_train and y_train and not X_test and y_test.
So you can create an imblearn pipeline that has a SMOTE sampler, a pre-processing step, and an SVC, as sketched below.
For more details check out this stack overflow post here and machine learning mastery article here.
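A minimal sketch of such an imblearn pipeline, assuming imblearn's own SMOTE implementation and a Fisher transformer along the lines of the one above (with transform fixed to select feature columns, i.e. X[:, self.ind_feature]); the parameter values are illustrative only:

from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, StratifiedKFold

clf_all = Pipeline([
    ('smote', SMOTE(random_state=0)),       # resamples X and y on the training folds only
    ('fisher', Fisher(percentile=50)),      # custom feature-selection transformer from above
    ('normal', StandardScaler()),
    ('svc', SVC(class_weight='balanced')),  # 'balanced' replaces the deprecated 'auto'
])

param_grid = {
    'fisher__percentile': [50, 75],
    'svc__C': [1, 10],
    'svc__gamma': ['scale', 0.01],
}

search = GridSearchCV(clf_all, param_grid, scoring='f1', cv=StratifiedKFold(n_splits=5))
# search.fit(X, y)   # SMOTE and Fisher are refitted on each training fold inside the CV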

Resources