logmodel.fit(X_train, y_train) not working #333

Open
fb87fb opened this issue May 13, 2024 · 0 comments

Hi everyone,

I am working on a dataset, and when I run the code below I get the following error. I've tried searching for a solution on Google, but as a beginner I'm stuck. I would appreciate your help.

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Features: every column except the target; target: the "survived" column
X = titanic_data.drop("survived", axis=1)
y = titanic_data["survived"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

logmodel = LogisticRegression()
logmodel.fit(X_train, y_train)

From here I get:


TypeError Traceback (most recent call last)
Cell In[72], line 1
----> 1 logmodel.fit(X_train, y_train)

File ~\anaconda3\Lib\site-packages\sklearn\linear_model\_logistic.py:1196, in LogisticRegression.fit(self, X, y, sample_weight)

 1193 else:
 1194     _dtype = [np.float64, np.float32]
-> 1196 X, y = self._validate_data(
 1197     X,
 1198     y,
 1199     accept_sparse="csr",
 1200     dtype=_dtype,
 1201     order="C",
 1202     accept_large_sparse=solver not in ["liblinear", "sag", "saga"],
 1203 )
 1204 check_classification_targets(y)
 1205 self.classes_ = np.unique(y)

File ~\anaconda3\Lib\site-packages\sklearn\base.py:548, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)

  483 def _validate_data(
  484     self,
  485     X="no_validation",
 (...)
  489     **check_params,
  490 ):
  491     """Validate input data and set or check the `n_features_in_` attribute.
  492 
  493     Parameters
 (...)
  546         validated.
  547     """
--> 548     self._check_feature_names(X, reset=reset)
  550     if y is None and self._get_tags()["requires_y"]:
  551         raise ValueError(
  552             f"This {self.__class__.__name__} estimator "
  553             "requires y to be passed, but the target y is None."
  554         )

File ~\anaconda3\Lib\site-packages\sklearn\base.py:415, in BaseEstimator._check_feature_names(self, X, reset)

 395 """Set or check the `feature_names_in_` attribute.
 396 
 397 .. versionadded:: 1.0
(...)
 411        should set `reset=False`.
 412 """
 414 if reset:
--> 415     feature_names_in = _get_feature_names(X)
 416     if feature_names_in is not None:
 417         self.feature_names_in_ = feature_names_in

File ~\anaconda3\Lib\site-packages\sklearn\utils\validation.py:1903, in _get_feature_names(X)

   1901 # mixed type of string and non-string is not supported
   1902 if len(types) > 1 and "str" in types:
-> 1903     raise TypeError(
   1904         "Feature names are only supported if all input features have string names, "
   1905         f"but your input has {types} as feature name / column name types. "
   1906         "If you want feature names to be stored and validated, you must convert "
   1907         "them all to strings, by using X.columns = X.columns.astype(str) for "
   1908         "example. Otherwise you can remove feature / column names from your input "
   1909         "data, or convert them all to a non-string data type."
   1910     )
   1912 # Only feature names of all strings are supported
   1913 if len(types) == 1 and types[0] == "str":

TypeError: Feature names are only supported if all input features have string names, but your input has ['int', 'str'] as feature name / column name types. If you want feature names to be stored and validated, you must convert them all to strings, by using X.columns = X.columns.astype(str) for example. Otherwise you can remove feature / column names from your input data, or convert them all to a non-string data type.
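
The message says the column names themselves, not the cell values, mix `int` and `str`. A minimal sketch for checking this is shown here. It assumes the data is a pandas DataFrame, and `X_demo` is a hypothetical stand-in for `X_train` because the real `titanic_data` is not shown. Integer names can appear, for instance, when dummy columns built from a numeric column are joined onto the frame, though whether that is what happened here is only a guess.

```python
import pandas as pd

# Hypothetical stand-in for X_train: two string column names and two integer ones.
X_demo = pd.DataFrame(
    {"age": [22.0, 38.0], "fare": [7.25, 71.28], 2: [0, 0], 3: [1, 0]}
)

# Distinct Python types used for the column names (similar in spirit to
# the check that produces the ['int', 'str'] list in the error message).
name_types = sorted({type(col).__name__ for col in X_demo.columns})
print(name_types)  # ['int', 'str']

# The columns whose names are not strings.
non_string_columns = [col for col in X_demo.columns if not isinstance(col, str)]
print(non_string_columns)  # [2, 3]
```

Running the same two checks on `X_train.columns` should show which columns trigger the error.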

What can I do to make it work?
Many thanks in advance.
